Question:
I'm trying to read in words from a file. I need to count the words, lines, and characters in the text file. The word count should only include words (containing only alphabetic letters, no punctuation, spaces, or non-alphabetic characters). The character count should only include the characters inside those words.
This is what I have so far. I'm unsure of how to count the characters. Every time I run the program, it jumps to the catch mechanism as soon as I enter the file name (and it should have no issues with the file path, as I've tried using it before). I tried to create the program without the try/catch to see what the error was, but it wouldn't work without it.
import java.util.Scanner;
import java.util.StringTokenizer;
import java.io.*;

public class WordCount
{
    public static void main(String[] args)
    {
        Scanner userInput = new Scanner(System.in);
        try {
            // Input file
            System.out.println("Please enter the name of the file.");
            String fileName = userInput.next( );
            File newFile = new File("C:/Users/garre/OneDrive/Desktop/" + fileName);
            // Word count, line count, and character count variables; temporary string variable
            int wordC = 0;
            int lineC = 0;
            int charC = 0;
            String tempo;
            // Text file scanner
            Scanner fileScan = new Scanner(newFile);
            while (fileScan.hasNextLine( )) {
                lineC++;
                tempo = fileScan.nextLine( );
                wordC += new StringTokenizer(tempo, "[.,:;()?!\"\\s]+").countTokens( );
                System.out.println("Lines: " + lineC + "\nWords: " + wordC);
            }
        }
        catch (IOException ex1) {
            System.out.println("Error.");
            System.exit(0);
        }
    }
}
Why is it jumping to the catch function when I enter the file name? How can I fix this program to properly count words, lines, and characters in the text file?
Answer 1:
I tried your code but I didn't receive any exception here. However, I suspect that when you input the file name, maybe you forgot the extension of the file.
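To check whether the file name is the culprit, it may help to print what the exception actually says instead of a fixed "Error." message. The sketch below is only an illustration: the class name OpenFileDiagnostics, the helper describeOpen, and the file name are made up for this example. It also shows why the original would not compile without try/catch: new Scanner(File) declares the checked FileNotFoundException, so the caller must either catch it or declare it with throws.

```java
import java.io.File;
import java.io.FileNotFoundException;
import java.util.Scanner;

class OpenFileDiagnostics {
    // Tries to open the file and returns a description of what happened.
    static String describeOpen(File file) {
        try (Scanner fileScan = new Scanner(file)) {
            return "Opened " + file.getAbsolutePath();
        } catch (FileNotFoundException ex) {
            // Include the exception message so the failing path is visible.
            return "Could not open " + file.getAbsolutePath() + ": " + ex.getMessage();
        }
    }

    public static void main(String[] args) {
        // Hypothetical file name; replace it with the name typed at the prompt.
        File candidate = new File("no-such-file-for-this-demo.txt");
        System.out.println("exists? " + candidate.exists());
        System.out.println(describeOpen(candidate));
    }
}
```

If the printed path is missing its extension (for example notes instead of notes.txt), that would confirm the guess above.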
Answer 2:
I don't get any exception with your code if I give it a proper file name. As for counting the characters, you should change the logic a little. Instead of adding the token count to wordC directly, keep the tokenizer in a variable, StringTokenizer st = new StringTokenizer(tempo, "[ .,:;()?!]+");
then iterate through all the tokens and sum the length of each one. That gives you the number of characters. Something like this:
while (fileScan.hasNextLine()) {
    lineC++;
    tempo = fileScan.nextLine();
    StringTokenizer st = new StringTokenizer(tempo, "[ .,:;()?!]+");
    wordC += st.countTokens();
    while (st.hasMoreTokens()) {
        String stt = st.nextToken();
        System.out.println(stt); // Print each token to confirm the line is split as expected
        charC += stt.length();
    }
    System.out.println("Lines: " + lineC + "\nWords: " + wordC + "\nChars: " + charC);
}
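Note: Escape sequences such as \s do not work with StringTokenizer. It does not interpret its delimiter argument as a regular expression; every character in the string, including [, ] and +, is treated as a literal delimiter. So you might expect \s to split on any whitespace character, but it splits on the literal character s instead. If you want regular-expression behavior, I suggest using java.util.regex.Pattern and java.util.regex.Matcher, and calling matcher.find() in a loop to identify the words and count their characters.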
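A minimal sketch of that Pattern/Matcher approach might look like the following. The class name RegexWordCount and the helper countWordsAndChars are invented for this illustration, and treating a word as a run of ASCII letters is an assumption based on the question's wording.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RegexWordCount {
    // Assumption for this sketch: a word is one or more ASCII letters in a row.
    private static final Pattern WORD = Pattern.compile("[A-Za-z]+");

    // Returns {wordCount, charCount} for a single line of text.
    static int[] countWordsAndChars(String line) {
        int words = 0;
        int chars = 0;
        Matcher matcher = WORD.matcher(line);
        while (matcher.find()) {
            words++;                          // each match is one word
            chars += matcher.group().length(); // only letters inside words are counted
        }
        return new int[] { words, chars };
    }

    public static void main(String[] args) {
        int[] counts = countWordsAndChars("Hello, world! It's 42.");
        System.out.println("Words: " + counts[0] + "\nChars: " + counts[1]);
    }
}
```

Under this definition an apostrophe ends a word, so "It's" is counted as the two words "It" and "s"; widen the pattern if that is not what you want.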
Answer 3:
You probably forgot the file extension while giving input, but there is a much simpler way of doing this. You also mention you don't know how to count the characters. You can try something like this:
import java.util.Scanner;
import java.io.*;
import java.nio.file.*; // Files and Path (Files.readString needs Java 11+)
import java.util.stream.*;

public class WordCount
{
    public static void main(String[] args)
    {
        Scanner userInput = new Scanner(System.in);
        try {
            // Input file
            System.out.println("Please enter the name of the file.");
            String content = Files.readString(Path.of("C:/Users/garre/OneDrive/Desktop/" + userInput.next()));
            System.out.printf("Lines: %d\nWords: %d\nCharacters: %d",content.split("\n").length,Stream.of(content.split("[^A-Za-z]")).filter(x -> !x.isEmpty()).count(),content.length());
        }
        catch (IOException ex1) {
            System.out.println("Error.");
            System.exit(0);
        }
    }
}
Going through the code
import java.util.stream.*;
Note we use the streams package, for filtering out empty strings while finding words. Now let's skip forward a bit.
String content = Files.readString(Path.of("C:/Users/garre/OneDrive/Desktop/" + userInput.next()));
The above part gets all of the text in the file and stores it as a string.
System.out.printf("Lines: %d\nWords: %d\nCharacters: %d",content.split("\n").length,Stream.of(content.split("[^A-Za-z]")).filter(x -> !x.isEmpty()).count(),content.length());
Okay, this is a long line. Let's break it down.
"Lines: %d\nWords: %d\nCharacters: %d" is a format string, where each %d is replaced with the corresponding argument of the printf call. The first %d is replaced by content.split("\n").length, which is the number of lines: splitting the string on newline characters gives one array element per line.
The second %d is replaced by Stream.of(content.split("[^A-Za-z]")).filter(x -> !x.isEmpty()).count(). Stream.of creates a stream from an array, here the array of strings you get by splitting on every non-alphabetic character (you said words contain only alphabetic letters). Next, we filter out the empty strings, since String.split keeps the empty strings between adjacent delimiters. Finally, .count() returns the number of words left after filtering.
The third and last %d is the simplest. It is replaced by content.length(), the length of the whole string, including spaces, punctuation, and line breaks.
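The question asks for the character count to cover only the letters inside words, while content.length() counts every character in the file. A variant along the following lines may be closer to that requirement. It is a sketch rather than a drop-in replacement: the class name TextStats and its helper methods are invented for this example, and String.lines() assumes Java 11 or later (the same requirement as Files.readString).

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class TextStats {
    // Splits on runs of non-letters and drops the empty leading piece, if any.
    static List<String> words(String content) {
        return Arrays.stream(content.split("[^A-Za-z]+"))
                .filter(w -> !w.isEmpty())
                .collect(Collectors.toList());
    }

    // Counts only the letters that belong to words, not spaces or punctuation.
    static int wordChars(String content) {
        return words(content).stream().mapToInt(String::length).sum();
    }

    // Counts lines; String.lines() accepts \n, \r and \r\n as line terminators.
    static long lines(String content) {
        return content.lines().count();
    }

    public static void main(String[] args) {
        // Sample text standing in for the file content read with Files.readString.
        String content = "One, two\nthree!\n";
        System.out.printf("Lines: %d%nWords: %d%nCharacters: %d%n",
                lines(content), words(content).size(), wordChars(content));
    }
}
```

Using content.lines() instead of split("\n") also avoids stray carriage returns when the file has Windows line endings.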
I left your catch block intact, but I feel like the System.exit(0) is a bit redundant.