【试题描述】把一个数组最开始的若干个元素搬到末尾,我们称之为数组的旋转。输入一个递增排序的数组的一个旋转,输出旋转数组的小元素。例如数组{3,4,5,1,2}为{1,2,3,4,5}的一个旋转,该数组的最小元素为1.
【试题来源】未知
【参考代码】
1 |
|
【试题描述】把一个数组最开始的若干个元素搬到末尾,我们称之为数组的旋转。输入一个递增排序的数组的一个旋转,输出旋转数组的小元素。例如数组{3,4,5,1,2}为{1,2,3,4,5}的一个旋转,该数组的最小元素为1.
【试题来源】未知
【参考代码】
1 | #include <iostream> |
【试题描述】用两个栈实现一个队列。队列声明如下,请实现它的两个函数appendTail,deleteHead,分别完成在队列尾部插入节点和在头部删除节点。
【试题来源】未知
【参考代码】
1 | #include <iostream> |
最近准备重温 C++,正好看到有关类和占用空间的问题,于是整理了下。先上代码
1 | #include <iostream> |
结果:
1 | ClassA:1 |
【分析】 1)ClassA 里面没有定义任何变量和函数,本应该是 0
的,但是声明变量需要占用一个字节,因此是 1。
2)ClassB
里面仅仅定义了构造函数和析构函数。调用构造函数和析构函数只需要知道地址即可,而这些函数的地址只与类型相关,而与实例无关,编译器不会因为这两个函数在实例中添加任何信息。
3)ClassC
中仅有一个虚函数。C++编译器一旦发现类型中有虚函数,就会为该类型生成虚函数表,并在该类型的每个实例中添加一个指向虚函数表的指针。
PS:编译器 GCC 4.6.2
【试题描述】请实现一个函数,把字符串中的每个空格替换成%20。例如输入"We are happy.",则输出"We%20are%20happy."。(不能使用任何字符串操作函数)
【试题来源】未知
【试题分析】由于不能使用字符串操作函数,因此只能一个字符一个字符的进行处理。 首先先遍历一遍字符串,统计得到所有的空格,那么替换后的字符串长度应 为len(str) + numOfBlank * 2。然后从后往前面修改字符串。算法时间复杂度O(n)。
【参考代码】
1 | #include <cstdio> |
【试题描述】输入二叉树的前序遍历和中序遍历的结果,请重建出该二叉树。假设遍历结果中都不包含重复数字。例如输入前序遍历序列{1,2,4,7,3,5,6,8}和中序遍历{4,7,2,1,5,3,8,6},则重建出该二叉树,并以后续遍历输出。
【参考代码】
1 | #include <iostream> |
几乎所有的博客都可以在线阅读,或者通过 RSS 订阅源进行阅读。RSS 订阅源是一个包含博客及其所有文章条目信息的简单的 XML 文档。 程序中使用了 feedparser 第三方模块,可以轻松地从任何 RSS 或 Atom 订阅源中得到标题、链接和文章的条目。完整代码如下:
1 | ''' |
1. Suppose that a Selection Sort of 80 items has completed 32
iterations of the main loop. How many items are now guaranteed to be in
their final spot (never to be moved again)?
A) 16 B) 31 C) 32 D) 39 E) 40
2. Which Synchronization mechanism(s) is/are used to avoid
race conditions among processes/threads in operating
systems?
A) Mutex B) Mailbox C) Semaphore D) Local procedure call
3. There is a sequence of n numbers 1, 2, 3,.., n and a stack
which can keep m numbers at most. Push the n numbers into the stack
following the sequence and pop out randomly. Suppose n is 2 and m is 3,
the output sequence may be 1, 2 or 2, 1, so we get 2 different
sequences. Suppose n is 7 and m is 5, please choose the output sequences
of the stack:
A) 1, 2, 3, 4, 5, 6, 7
B) 7, 6, 5, 4, 3, 2, 1
C) 5, 6, 4, 3, 7, 2, 1
D) 1, 7, 6, 5, 4, 3, 2
E) 3, 2, 1, 7, 5, 6, 4
4. What is the result of binary number 01011001 after
multiplying by 0111001 and adding 1101110?
A) 0001 0100 0011 1111
B) 0101 0111 0111 0011
C) 0011 0100 0011 0101
5. What is output if you compile and execute the following code?
1 | void main() |
6. Which of following C++ code is correct?
A) int f() { int a = new int(3); return a; } B) int f() {
int a[3] = {1, 2, 3}; return a; } C) vector
7. Given that the 180-degree rotated image of a 5-digit
number is another 5-digit number and the difference between the numbers
is 78633, what is the original 5-digit number?
A) 60918 B) 91086 C) 18609 D) 10968 E) 86901
8. Which of the following statements are true?
A) We can create a binary tree from given inorder and preorder traversal
sequences.
B) We can create a binary tree from given preorder and postorder
traversal sequences.
C) For an almost sorted array, insertion sort can be more effective than
Quicksort.
D) Suppose T(n) is the runtime of resolving a problem with n elements,
T(n) = Θ(1) if n = 1; T(n) = 2T(n/2) + Θ(n) if > 1; so T(n) is Θ(n
log n).
E) None of the above.
9. Which of the following statements are true?
A) Insertion sort and bubble sort are not effcient for large data
sets.
B) Quick sort makes O(n^2) comparisons in the worst case.
C) There is an array: 7, 6, 5, 4, 3, 2, 1. If using selection sort
(ascending), the number of swap operation is 6.
D) Heap sort uses two heap operations: insertion and root
deletion.
E) None of above.
10. Assume both x and y are integers, which one of the
followings returns the minimum of the two integers?
A) y ^ ((x ^ y) & ~(x < y))
B) y ^(x ^ y)
C) x ^ (x ^ y)
D) (x ^ y) ^ (y ^ x)
E) None of the above
11. The Orchid Pavilion (兰亭集序) is well known as the top
of "行书" in history of Chinese literature. The most fascinating
sentence "Well I know it is a lie to say that life and death is the same
thing, and that longevity and early death make no difference! Alas!"
("周知一死生为虚诞,齐彭殇为妄作。") By counting the characters of the
whole content (in Chinese version), the result should be 391 (including
punctuation). For these characters written to a text file, please select
the possible file size without any data corrupt.
A) 782 bytes in UTF-16 encoding
B) 784 bytes in UTF-16 encoding
C) 1173 bytes in UTF-8 encoding D) 1176 bytes in UTF-8 encoding
E) None of the above
12. Fill the blanks inside class definition
1 | class Test |
Running result: 0 20 1 20
A) static/const
B) const/static
C) --/static
D) const static/static
E) None of the above
13. A 3-order B-tree has 2047 key words, what is the maximum
height of the tree?
A) 11 B) 12 C) 13 D) 14
14. In C++, which of the following keyword(s) can be used on
both a variable and a function?
A) static B) virtual C) extern D) inline E) const
15. What is the result of the following program?
1 | char* f(char *str, char ch) |
16. Consider the following definition of a recursive function, power, that will perform exponentiation.
1 | int power(int b, int e) |
Asymptotically (渐进地) in terms of the exponent e, the number of calls to power that occur as a result of the call power(b, e) is* A) logarithmic B) linear C) quadratic D) exponential
17. Assume a full deck of cards has 52 cards, 2 black suits
(spade and club) and 2 red suits (diamond and heart). If you are given a
full deck, and a half deck (with 1 red suit and 1 black suit), what's
the possibility for each one getting 2 red cards if taking 2
cards?
A) 1/2, 1/2 B) 25/102, 12/50 C) 50/51, 24/25 D) 25/51, 12/25 E) 25/51,
1/2
18. There is a stack and a sequence of n numbers (i.e., 1, 2,
3, ..., n). Push the n numbers into the stack following the sequence and
pop out randomly. How many different sequences of the number we may get?
Suppose n is 2, the output sequence may be 1, 2 or 2, 1, so we get 2
different sequences.
A) C_2n^n
B) C_2n^n - C_2n^(n + 1)
C) ((2n)!) / (n + 1)n!n!
D) n!
E) None of the above
19. Longest Increasing Subsequence (LIS) means a sequence
containing some elements in another sequence by the same order, and the
values of elements keeps increasing. For example, LIS of {2, 1, 4, 2, 3,
7, 4, 6} is {1, 2, 3, 4, 6}, and its LIS length is 5. Considering an
array with N elements, what is the lowest time and space complexity to
get the length of LIS?
A) Time: N^2, Space: N^2
B) Time: N^2, Space: N
C) Time: NlogN, Space: N
D) Time: N, Space: N
E) Time: N, Space: C
20. What is the output of the following piece of C++ code?
1 | #include <iostream> |
Ubuntu: sudo apt-get install ttf-arphic-uming
Fedora:sudo yum install cjkuni-uming-fonts
然后打开http://web.sanguosha.com/ 测试,如果不行请重启电脑。
PS:本人测试后发现安装字体只能解决 Firefox 下面的乱码,但无法解决 Chrome 浏览器下面的问题。
参考资料: http://hi.baidu.com/gerald07/item/fb2234cfbfef9f21e80f2e77 http://hi.baidu.com/siweng23/blog/item/aa9dfcfbd4f4564c242df2ec.html http://blog.csdn.net/devplus/article/details/6347492
协同过滤技术可以分为三类:基于用户(User-based)的协同过滤;基于项目(Item-based)的协同过滤;基于模型(Model-based)的协同过滤。
基于用户(User-based)的协同过滤
用相似统计的方法得到具有相似爱好或者兴趣的相邻用户,所以称之为以用户为基础(User-based)的协同过滤或基于邻居的协同过滤(Neighbor-based Collaborative Filtering)。方法步骤:
1.收集用户资讯
收集可以代表用户兴趣的资讯。一般的网站系统使用评分的方式或是给予评价,这种方式被称为「主动评分」。另外一种是「被动评分」,是根据用户的行为模式由系统代替用户完成评价,不需要用户直接打分或输入评价资料。电子商务网站在被动评分的资料获取上有其优势,用户购买的商品记录是相当有用的资料。
2.最近邻搜索(Nearest neighbor search, NNS)
以用户为基础(User-based)的协同过滤的出发点是与用户兴趣爱好相同的另一组用户,就是计算两个用户的相似度。例如:寻找n个和A有相似兴趣用户,把他们对M的评分作为A对M的评分预测。一般会根据资料的不同选择不同的演算法,目前较多使用的相似度演算法有Person Correlation Coefficient、Cosine-based Similarity、Adjusted Cosine Similarity。
3.产生推荐结果
有了最近邻集合,就可以对目标用户的兴趣进行预测,产生推荐结果。依据推荐目的的不同进行不同形式的推荐,较常见的推荐结果有Top-N推荐和关联推荐。 Top-N推荐是针对个体用户产生,对每个人产生不一样的结果,例如:透过对A用户的最近邻用户进行统计,选择出现频率高且在A用户的评分项目中不存在的,作为推荐结果。关联推荐是对最近邻用户的记录进行关联规则(association rules)挖掘。
基于项目(Item-based)的协同过滤
以项目为基础的协同过滤方法有一个基本的假设:“能够引起用户兴趣的项目,必定与其之前评分高的项目相似”,透过计算项目之间的相似性来代替用户之间的相似性。方法步骤:
1.收集用户资讯
同以用户为基础(User-based)的协同过滤。
2.针对项目的最近邻搜索
先计算己评价项目和待预测项目的相似度,并以相似度作为权重,加权各已评价项目的分数,得到待预测项目的预测值。例如:要对项目A和项目B进行相似性计算,要先找出同时对A和B打过分的组合,对这些组合进行相似度计算,常用的演算法同以用户为基础(User-based )的协同过滤。
3.产生推荐结果
以项目为基础的协同过滤不用考虑用户间的差别,所以精度比较差。但是却不需要用户的历史资料,或是进行用户识别。对于项目来讲,它们之间的相似性要稳定很多,因此可以离线完成工作量最大的相似性计算步骤,从而降低了线上计算量,提高推荐效率,尤其是在用户多于项目的情形下尤为显著。
基于模型(Model-based)的协同过滤
以用户为基础(User-based)的协同过滤和以项目为基础(Item-based)的协同过滤统称为以记忆为基础(Memory based)的协同过滤技术,他们共有的缺点是资料稀疏,难以处理大资料量影响即时结果,因此发展出以模型为基础的协同过滤技术。以模型为基础的协同过滤(Model-based Collaborative Filtering)是先用历史资料得到一个模型,再用此模型进行预测。以模型为基础的协同过滤广泛使用的技术包括Latent Semantic Indexing、 Bayesian Networks…等,根据对一个样本的分析得到模型。
本文构建了一个简单的推荐系统,使用的数据是真实的数据,叫作MovieLens,来自University of Minnesota‘s GroupLens项目组。代码以Python作为实现语言,使用版本为Python2.7。
loadMovieData:用于数据的读取。userData指的是以userId为键构建的电影评分列表。movieData值的是以movieId为键构建的电影评分列表。
euclidean:用于计算Eucidean距离系数
pearson:用于计算Pearson相关系数
getSimilarItems:计算出movieData中每一项相似度最大的n项
getRecommendationsItems:对于某个user取得推荐结果
代码如下:
1 | ''' |