I am trying to parse a log file and store it in a CSV file. Here is a sample line:
218.1.111.50 - - [13/Mar/2005:10:36:11 -0500] "GET http://www.yahoo.com/ HTTP/1.1" 403 2898 "-" "Mozilla/4.0 (compatible; MSIE 4.01; Windows 95)"
For this, I am using the Apache Commons CSV library. The problem is that some fields contain the special character ; in their value, and it gets interpreted as a separator.
Take, for example, the field value Mozilla/4.0 (compatible; MSIE 4.01; Windows 95). This single field ends up split into 3 different values because of the ; characters.
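To make the symptom concrete, here is a minimal sketch in plain Java (it does not use Commons CSV) of what happens when whatever reads the file treats ; as the separator. The class and method names are hypothetical and only serve the illustration.

```java
import java.util.Arrays;
import java.util.List;

class SemicolonSplitDemo {

    // Naive split on ';', which is roughly what a reader does
    // when it treats ';' as the field separator.
    static List<String> splitOnSemicolon(String value) {
        return Arrays.asList(value.split(";"));
    }

    public static void main(String[] args) {
        String userAgent = "Mozilla/4.0 (compatible; MSIE 4.01; Windows 95)";
        List<String> parts = splitOnSemicolon(userAgent);

        // The single user-agent value comes back as three pieces.
        System.out.println("Number of pieces: " + parts.size());
        for (String part : parts) {
            System.out.println("[" + part + "]");
        }
    }
}
```

The user agent is one logical field, but it comes back as three pieces, which matches what I see in the resulting file.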
I don't know the best way to work around this. Below is a snapshot of the code that uses the library:
CSVPrinter printer = new CSVPrinter(writer, CSVFormat.DEFAULT
        .withHeader(HEADERS));
//
//
Matcher m = p.matcher(line);
Date date = formatter.parse(m.group("Time"));
try {
    printer.printRecord(date.getMonth(), date.getDate(), date.getHours(),
            date.getMinutes(), date.getSeconds(),
            m.group("NetworkSrcIpv4"),
            m.group("ApplicationHttpStatus"),
            m.group("ApplicationLen"),
            m.group("ApplicationHttpUserAgent"),
            m.group("ApplicationHttpQueryString"));
    printer.flush();
} catch (IOException e) {
    e.printStackTrace();
}
//
Is there any way to automatically ignore the ; characters, or perhaps to replace them with a value that won't affect the desired result? Are there any options I could add to my CSVPrinter?
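One direction I can think of is to wrap every value in double quotes, so that a ; inside a value is no longer seen as a separator. Below is a minimal sketch of that idea in plain Java, following the usual CSV quoting convention. It does not use Commons CSV, and quoteField, toRecord and parseRecord are hypothetical helpers written only for this example. I have not confirmed whether CSVPrinter offers an equivalent option.

```java
import java.util.ArrayList;
import java.util.List;

class QuotedCsvSketch {

    // Wrap a value in double quotes and double any embedded quote character.
    static String quoteField(String value) {
        return "\"" + value.replace("\"", "\"\"") + "\"";
    }

    // Quote every value and join the results with the given separator.
    static String toRecord(char separator, String... values) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < values.length; i++) {
            if (i > 0) {
                sb.append(separator);
            }
            sb.append(quoteField(values[i]));
        }
        return sb.toString();
    }

    // Split a record on the separator, ignoring separators inside quoted fields.
    static List<String> parseRecord(String record, char separator) {
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < record.length(); i++) {
            char c = record.charAt(i);
            if (inQuotes) {
                if (c == '"') {
                    if (i + 1 < record.length() && record.charAt(i + 1) == '"') {
                        current.append('"'); // doubled quote means a literal quote
                        i++;
                    } else {
                        inQuotes = false;    // closing quote
                    }
                } else {
                    current.append(c);
                }
            } else if (c == '"') {
                inQuotes = true;             // opening quote
            } else if (c == separator) {
                fields.add(current.toString());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        fields.add(current.toString());
        return fields;
    }

    public static void main(String[] args) {
        String userAgent = "Mozilla/4.0 (compatible; MSIE 4.01; Windows 95)";
        String record = toRecord(';', "218.1.111.50", "403", "2898", userAgent);
        System.out.println(record);

        // A quote-aware reader sees four fields, with the user agent intact.
        List<String> fields = parseRecord(record, ';');
        System.out.println(fields.size());
        System.out.println(fields.get(3));
    }
}
```

With quoting in place, the user agent stays a single field even when ; is the separator. This only helps if the program that later opens the CSV file honors quoted fields.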
Thank you for your feedback.