Current Topic: 1.5. Text Processing and Collections
Sub-Topics: 1.5.1. Character and other Java Wrapper Classes | 1.5.2. Formatting output | 1.5.3. Parsing text with patterns | 1.5.4. Collections: ArrayList, HashMap, Hashtable and more
The basic text-processing methods in Java are located in the String class.

Java developers have built more sophisticated methods, and even whole frameworks, on top of these basic methods.
What is a String? It is an object that represents a sequence of characters; in source code a String literal is written between double quotes. For example, "Internet Technology School" is a String object. A String object also has very important methods for handling its characters.

Eclipse helps you see all these methods and choose one. Open Eclipse and create a new project week3 and a new package day9.text. Create a new class Stringer. Provide a class header with the plan to write a set of text processing utilities. Start with the main() method and type the lines displayed below.

/**
* Test several methods of the String class and (later) test Stringer utility methods
* @param args
*/
public static void main(String[] args) {
String s = "Internet Technology School";
char firstCharacter = s.charAt(0); // the first character is 'I'
System.out.println(firstCharacter);
}


We created an object of the String class and named this object s. When we type s. (s dot), Eclipse displays all the methods that belong to this object. Our goal is to find a method that returns a single character at a selected position. Fortunately, this method is the first one displayed by Eclipse. We just click on this method to select it and provide the index 0 to indicate that we are interested in the very first character of the string s.

You can (this is recommended) select several more methods of the String class one by one and read about them. Yes, there are so many methods that it is impossible to remember all of them. But Eclipse always helps you view all the methods of any object, so you just need a good idea of what method you need. Usually the names of Java methods are self-explanatory. This rule should apply to the methods you create in your programs!
Let us still go over the most often used methods of the String class.
s.endsWith(anotherString): returns true if the String s ends with the value of another string. The comparison is case-sensitive.


if(s.endsWith("School")) {
// returns true and will perform this block
}
if(s.endsWith("school")) {
// returns false and will not perform this block because school is not School
}
if(!s.endsWith("school")) {
// returns true and will perform this block
}
A similar method is startsWith(anotherString)
if(s.startsWith("School")) {
// returns false and will not perform this block
}
if(s.startsWith("Internet")) {
// returns true and will perform this block
}
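As a small runnable sketch of the checks above: the class name StartsEndsDemo and the helper endsWithIgnoreCase are made up for this illustration and are not part of the lesson's Stringer class. The helper shows one possible way to make the check ignore case, by lowering both strings first.

```java
class StartsEndsDemo {
    // Hypothetical helper: true when text ends with suffix, ignoring upper/lower case.
    static boolean endsWithIgnoreCase(String text, String suffix) {
        return text.toLowerCase().endsWith(suffix.toLowerCase());
    }

    public static void main(String[] args) {
        String s = "Internet Technology School";
        System.out.println(s.endsWith("School"));            // true
        System.out.println(s.endsWith("school"));            // false: the check is case-sensitive
        System.out.println(s.startsWith("Internet"));        // true
        System.out.println(s.startsWith("School"));          // false
        System.out.println(endsWithIgnoreCase(s, "school")); // true
    }
}
```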


Imagine that our program is searching a big text for a specific pattern. For example, the program might scan the Yahoo Weather page looking for the highest temperature of the day at your location. The value of the temperature can be found between two patterns: <span class="hi f w-up-arrow"> and &deg; (the HTML code for the degree symbol).

Of course, someone first needs to analyze the page at https://weather.yahoo.com manually and find these patterns. But after this is done, your program can do the search for you automatically and send you a signal when the temperature is too high or too low.
In your small program you will use the s.indexOf(pattern) method, which returns the index of the first occurrence of the pattern, or -1 if the pattern is not found. We will also use several other methods:

int indexOfBeginning = bigText.indexOf("<span class=\"hi f w-up-arrow\">");

s.substring(indexOfBeginning): returns the part of the string that starts at the index of beginning and runs to the end of the string
s.substring(indexOfBeginning, indexOfEnd): returns the part of the string that starts at the index of beginning and ends just before the index of end (the character at indexOfEnd is not included)
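The index arithmetic of indexOf and substring is easy to get wrong by one, so here is a minimal sketch on the sample string used earlier. The class SubstringDemo and the helper textAfter are hypothetical names introduced only for this example.

```java
class SubstringDemo {
    // Hypothetical helper: returns the text after the first occurrence of marker,
    // or an empty string if the marker is not found.
    static String textAfter(String text, String marker) {
        int index = text.indexOf(marker);
        if (index < 0) {
            return "";
        }
        return text.substring(index + marker.length());
    }

    public static void main(String[] args) {
        String s = "Internet Technology School";
        int index = s.indexOf("Technology");
        System.out.println(index);                         // 9
        System.out.println(s.substring(index));            // Technology School
        System.out.println(s.substring(index, index + 4)); // Tech (the end index is excluded)
        System.out.println(s.indexOf("College"));          // -1, not found
        System.out.println(textAfter(s, "Technology "));   // School
    }
}
```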

In the example below, the program reads the text with the utility IOMaster.readTextFile. We will learn later how to expand this utility so that it can read a web page directly. To use the IOMaster utilities in this exercise, copy this class to the current package. We will also learn later how to establish a library, so that we do not need to copy the same classes from package to package. For now, you can open this web page manually in your browser. Then right-click on the page and select the option View Source. Copy and paste this source into a WordPad file and save it as c:/its-resources/weather.txt.

The source below therefore does not use the line:
IOMaster.readTextFile("https://weather.yahoo.com");

It reads the saved file instead:

IOMaster.readTextFile("c:/its-resources/weather.txt");
Another useful method is length(). It is similar to the length field of an array, but in the String class it is a method, not a field, so it is called with parentheses. We use this method in the following line:

String temperature = bigText.substring(indexOfBeginning + patternBeforeValue.length(), indexOfEnd);

We add the length() of patternBeforeValue so that the result starts right after the pattern and contains only the numeric value. The complete main() method of our small program follows.


/**
* Test several methods of the String class and (later) Stringer utility methods
* @param args
*/
public static void main(String[] args) {
// find the highest temperature of the day in the https://weather.yahoo.com page
// Open this web page and with right-mouse click - View Source -
// copy/paste the content of the page into c:/its-resources/weather.txt - file
// (We will learn later how to read web pages with similar Java code...)
// Then read this file and do text processing, find the highest temperature of the day
String bigText = IOMaster.readTextFile("c:/its-resources/weather.txt");
String patternBeforeValue = "<span class=\"hi f w-up-arrow\">"; // Note escape characters before the quote characters
// The degree symbol is represented in the HTML code by 5 characters: &, d, e, g, ;
String patternAfterValue = "&deg;";
int indexOfBeginning = bigText.indexOf(patternBeforeValue);
if(indexOfBeginning >= 0) { // check if found
int indexOfEnd = bigText.indexOf(patternAfterValue, indexOfBeginning); // start looking from the indexOfBeginning
if(indexOfEnd >= 0) { // check if found
// note that we add the length() of patternBeforeValue to get only the numeric value as a result
String temperature = bigText.substring(indexOfBeginning + patternBeforeValue.length(), indexOfEnd);
System.out.println("The highest temperature today is "+temperature);
}
}
}
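If you do not have the IOMaster class or the saved weather file at hand, the same technique can be tried on a string in memory. The following is only a sketch: the class ExtractBetweenDemo, the method extractBetween, and the one-line HTML sample are invented for this illustration, and the real Yahoo page may look different.

```java
class ExtractBetweenDemo {
    // Hypothetical helper: returns the text between the first occurrence of
    // 'before' and the next occurrence of 'after', or null if either is missing.
    static String extractBetween(String text, String before, String after) {
        int indexOfBeginning = text.indexOf(before);
        if (indexOfBeginning < 0) {
            return null; // the first pattern was not found
        }
        int valueStart = indexOfBeginning + before.length();
        int indexOfEnd = text.indexOf(after, valueStart); // search only after the first pattern
        if (indexOfEnd < 0) {
            return null; // the second pattern was not found
        }
        return text.substring(valueStart, indexOfEnd);
    }

    public static void main(String[] args) {
        // Made-up fragment standing in for the saved page source
        String bigText = "<div><span class=\"hi f w-up-arrow\">69&deg;</span></div>";
        String temperature = extractBetween(bigText, "<span class=\"hi f w-up-arrow\">", "&deg;");
        System.out.println("The highest temperature today is " + temperature);
    }
}
```

Returning null when a pattern is missing is one possible design choice; the caller then has to check for null before using the value.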

We can change a String to all uppercase using the toUpperCase() method.
String s = "Internet".toUpperCase(); // the result is INTERNET
Or to all lowercase using the toLowerCase() method:
String s = "Internet".toLowerCase(); // the result is internet


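To compare the contents of two strings, use the equals() method or equalsIgnoreCase().

But NEVER use the "==" operator for this purpose: it checks whether two variables refer to the same object, not whether the two strings contain the same characters.
String temperatureString = "69"; // string!
int digitTemperature = Integer.parseInt(temperatureString); // the result is an int
if(digitTemperature == 69) // correct! comparing int values with == is fine

But
if(temperatureString == "69") // wrong: compares object references, so the result is unreliable
if(temperatureString.equals("69")) // correct!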
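To see the difference in action, here is a short sketch. The class CompareDemo and the helper sameText are hypothetical names for this example; new String("69") is used only to force a separate object with the same characters.

```java
class CompareDemo {
    // Hypothetical helper: compares the contents of two strings and tolerates null.
    static boolean sameText(String a, String b) {
        return a != null && a.equals(b);
    }

    public static void main(String[] args) {
        String literal = "69";
        String built = new String("69"); // a distinct object with the same characters

        System.out.println(literal == built);                    // false: different objects
        System.out.println(literal.equals(built));               // true: same characters
        System.out.println("School".equalsIgnoreCase("school")); // true
        System.out.println(Integer.parseInt(built) == 69);       // true: int comparison
        System.out.println(sameText(null, "69"));                // false, and no exception
    }
}
```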

There are too many String methods to talk about all of them, but the method split(pattern) is too important to skip. Imagine that you have a huge text as a single String object and you would like your program to walk through every line. The split method allows you to split a single String into an array of lines. Note that split treats the pattern as a regular expression.
String[] lines = text.split(System.lineSeparator()); // On Windows it is \r\n (two characters) and on Unix it is \n (just one character)

Or consider another scenario. Your program should walk through a web page with many tables. In HTML a table has its specific table tags: the table beginning and the table end patterns. The program can split this HTML text into parts, where each part has just one table.

String[] tableParts = html.split(tableEndPattern);
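Both scenarios can be sketched in a few lines. The names SplitDemo, splitIntoLines and splitIntoTables are made up here, and "</table>" is an assumed value for tableEndPattern. Instead of System.lineSeparator(), this sketch uses the regular expression \R (available since Java 8), which matches any line break, so a file saved on Windows is still split correctly when the program runs on Unix.

```java
class SplitDemo {
    // Splits text into lines on any kind of line break (\r\n, \n, or \r).
    static String[] splitIntoLines(String text) {
        return text.split("\\R");
    }

    // Splits HTML into parts, each ending where a table ended.
    // Assumption: the table end pattern is the closing tag "</table>".
    static String[] splitIntoTables(String html) {
        return html.split("</table>");
    }

    public static void main(String[] args) {
        String text = "first line\r\nsecond line\nthird line";
        for (String line : splitIntoLines(text)) {
            System.out.println(line);
        }

        String html = "<table>one</table><table>two</table>";
        for (String part : splitIntoTables(html)) {
            System.out.println(part); // the closing tag itself is removed by split
        }
    }
}
```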


Assignments:
1. In Eclipse modify the class Stringer in the following way:
Rename the main() method to testStringMethods() with no arguments.
It will look like this:

public static void testStringMethods() {
// place inside this method all one line samples of what can be done with the String
// for example, using toLowerCase();
String s = "Internet".toLowerCase(); // the result is internet
// after each sample like that display the result
System.out.println(s);
// more samples like that

}

Open the Nasdaq web page, view its source, and store the source in a text file.
The path to the file: "c:/its-resources/Nasdaq.txt"
Open this file in Notepad and check for several stock index names, like "FB", "NQGM", etc.

Create another method: public static void lookForNasdaqMainIndex(String indexName).
For this method, provide a header with your plan; the plan below is based on an analysis of the source of the Nasdaq page.

/**
* Display the lines from the web page that include information about the stock index provided as an argument to the method.
* Read the Nasdaq web page from the text file "c:/its-resources/Nasdaq.txt"
* Split text into array of lines: String[] lines = text.split(System.lineSeparator());
* In a for loop check each line for the pattern that includes the stock index name
* String pattern = "symbol="+indexName+"&";
* for(int i=0; i < lines.length; i++) {
*   String line = lines[i];
*   if(line.contains(pattern)) {
*     // When the line with the pattern is found, make another loop to display 5 lines, which will include this stock data
*     // (stop early if fewer than 5 lines remain, to avoid going past the end of the array)
*     for(int j=0; j < 5 && i+j < lines.length; j++) {
*       System.out.println(lines[i+j]);
*     }
*   }
* }
*/

Follow the plan and implement this method; copy each comment line before its implementation line, as in the examples above.

Create the main method to test both methods: testStringMethods() and lookForNasdaqMainIndex("NQGM")

public static void main(String[] args) {
testStringMethods();
lookForNasdaqMainIndex("NQGM");
System.exit(0);
}

2. Go to the web page:
http://www.nasdaq.com/markets/indices/major-indices.aspx

You should see on this page the table with the following columns:

Symbol | Name| Index Value | Change Net / % | High | Low

View-source and store the file as c:/its-resources/Nasdaq.txt

Run the Stringer class with your new main() method that will call and test both methods:
testStringMethods();
lookForNasdaqMainIndex("NQGM");

3. Extra points: look again at the Nasdaq page in Notepad and note that the page includes a lot of HTML tables.
They start with the tag "<table"
Use the split() method to split the Nasdaq web page into the table parts.

4. (Optional for extra points) Read on text processing with Regular Expressions

5. Create at least one similar QnA with the correct answer "Run-time error" and email it to dean@ituniversity.us

6. These are useful links on the subject:

http://www.tutorialspoint.com/java/lang/java_lang_string.htm

https://www.youtube.com/watch?v=vW53w7me4AE

Search Google and YouTube for better presentations on this subject; read, watch, select, review and discuss the best links by using the Discussion Link below.

Internet Technology University | JavaSchool.com | Copyrights © Since 1997 | All Rights Reserved