Counting the occurrences of each word
Published: 2020-12-16 07:16:53 | Category: Encyclopedia | Source: compiled from the web
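I am trying to count the occurrences of each word in the function countWords. I believe I have set up the for loop in the function correctly, but how do I compare the words in the array with each other, count them, and then remove the duplicates? Isn't it something like a Fibonacci series, or am I mistaken? Also, int n has the value 756 because that is how many words are in the array, and wordArray holds the elements of the array.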
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <ctype.h>

int *countWords( char **words,int n);

int main(int argc,char *argv[])
{
    char buffer[100]; //Maximum word size is 100 letters
    FILE *textFile;
    int numWords=0;
    int nextWord;
    int i,j,len,lastChar;
    char *wordPtr;
    char **wordArray;
    int *countArray;
    int *alphaCountArray;
    char **alphaWordArray;
    int *freqCountArray;
    char **freqWordArray;
    int choice=0;

    //Check to see if command line argument (file name)
    //was properly supplied. If not,terminate program
    if(argc == 1)
    {
        printf ("Must supply a file name as command line argument\n");
        return (0);
    }

    //Open the input file. Terminate program if open fails
    textFile=fopen(argv[1],"r");
    if(textFile == NULL)
    {
        printf("Error opening file. Program terminated.\n");
        return (0);
    }

    //Read file to count the number of words
    fscanf(textFile,"%s",buffer);
    while(!feof(textFile))
    {
        numWords++;
        fscanf(textFile,"%s",buffer);
    }
    printf("The total number of words is: %d\n",numWords);

    //Create array to hold pointers to words
    wordArray = (char **) malloc(numWords*sizeof(char *));
    if (wordArray == NULL)
    {
        printf("malloc of word Array failed. Terminating program.\n");
        return (0);
    }

    //Rewind file pointer and read file again to create
    //wordArray
    rewind(textFile);
    for(nextWord=0; nextWord < numWords; nextWord++)
    {
        //read next word from file into buffer.
        fscanf(textFile,"%s",buffer);

        //Remove any punctuation at beginning of word
        i=0;
        while(!isalpha(buffer[i]))
        {
            i++;
        }
        if(i>0)
        {
            len = strlen(buffer);
            for(j=i; j<=len; j++)
            {
                buffer[j-i] = buffer[j];
            }
        }

        //Remove any punctuation at end of word
        len = strlen(buffer);
        lastChar = len -1;
        while(!isalpha(buffer[lastChar]))
        {
            lastChar--;
        }
        buffer[lastChar+1] = '\0';

        //make sure all characters are lower case
        for(i=0; i < strlen(buffer); i++)
        {
            buffer[i] = tolower(buffer[i]);
        }

        //Now add the word to the wordArray.
        //Need to malloc an array of chars to hold the word.
        //Then copy the word from buffer into this array.
        //Place pointer to array holding the word into next
        //position of wordArray
        wordPtr = (char *) malloc((strlen(buffer)+1)*sizeof(char));
        if(wordPtr == NULL)
        {
            printf("malloc failure. Terminating program\n");
            return (0);
        }
        strcpy(wordPtr,buffer);
        wordArray[nextWord] = wordPtr;
    }

    //Call countWords() to create countArray and replace
    //duplicate words in wordArray with NULL
    countArray = countWords(wordArray,numWords);
    if(countArray == NULL)
    {
        printf("countWords() function returned NULL; Terminating program\n");
        return (0);
    }

    //Now call compress to remove NULL entries from wordArray
    compress(&wordArray,&countArray,&numWords);
    if(wordArray == NULL)
    {
        printf("compress() function failed; Terminating program.\n");
        return(0);
    }
    printf("Number of words in wordArray after eliminating duplicates and compressing is: %d\n",numWords);

    //Create copy of compressed countArray and wordArray and then sort them alphabetically
    alphaCountArray = copyCountArray(countArray,numWords);
    freqCountArray = copyCountArray(alphaCountArray,numWords);
    //(the rest of main() is not shown in the post)

int *countWords( char **wordArray,int n)
{
    return NULL;
    int i=0;
    int n=0;
    for(i=0;i<n;i++)
    {
        for(n=0;n<wordArray[i];n++)
        {
        }
    }
}

Solution
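Assuming you want countWords to return an array of integers holding the count for each unique word, you need a double loop. The outer loop walks the whole array; the inner loop walks the rest of the array (everything after the current word), looking for duplicates.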
You could follow pseudocode along these lines:

Allocate the return array countArray (n integers)
Loop over all words (as you currently do in your `for i` loop)
    If the word at `i` is not NULL // Check we haven't already deleted this word
        // Found a new word
        Set countArray[i] to 1
        Loop through the rest of the words e.g. for (j = i + 1; j < n; j++)
            If the word at j is not NULL and matches the word at i (using strcmp)
                // Found a duplicate word
                Increment countArray[i] (the original word's count)
                // We don't want wordArray[j] anymore, so
                Free wordArray[j]
                Set wordArray[j] to NULL
    Else
        // A NULL indicates this was a duplicate, set the count to 0 for consistency.
        Set countArray[i] to 0
Return countArray

(Editor: 李大同)
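For reference, the double loop described in the answer might look roughly like this in C. This is only a sketch, not the asker's final code. It assumes every non-NULL entry of wordArray was allocated with malloc (as in the main() from the question), so duplicates can be freed. Returning NULL when n is not positive and the assert on the input pointer are my own assumptions, not something the question specifies.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Sketch of countWords: returns a malloc'd array of n counts.
 * countArray[i] is the number of occurrences of wordArray[i];
 * later duplicates are freed, set to NULL, and get a count of 0.
 * Returns NULL if n <= 0 or if malloc fails (an assumption). */
int *countWords(char **wordArray, int n)
{
    assert(wordArray != NULL);   /* precondition assumed by this sketch */
    if (n <= 0) {
        return NULL;
    }

    int *countArray = malloc((size_t)n * sizeof *countArray);
    if (countArray == NULL) {
        return NULL;
    }

    for (int i = 0; i < n; i++) {
        if (wordArray[i] == NULL) {
            /* Already removed as a duplicate of an earlier word. */
            countArray[i] = 0;
            continue;
        }

        /* First occurrence of this word. */
        countArray[i] = 1;

        /* Scan only the words after position i for duplicates. */
        for (int j = i + 1; j < n; j++) {
            if (wordArray[j] != NULL &&
                strcmp(wordArray[i], wordArray[j]) == 0) {
                countArray[i]++;
                free(wordArray[j]);      /* the copy is no longer needed */
                wordArray[j] = NULL;
            }
        }
    }

    return countArray;
}
```

Note that the words are compared with strcmp rather than ==, since == on two char * values only compares addresses. The nested scan is quadratic in n, which should be fine for the 756 words mentioned in the question; a sort or hash table would be the usual choice for much larger inputs.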