Comments in web pages

naveen

New Member
Hi guys, I am working on a project and need to extract comments from forum and blog web pages.

The problem is that every page implements comments differently: some use a div element with id="comment" or class="comment", some nest a div inside another div, and some have no id or class on the comment markup at all. There are many possibilities.

I need a general solution for getting all the comments from any web page.

I am doing this in Java with the Jsoup library, which works well for extracting content, but how do I identify the comment blocks when there are so many possible layouts? If that is not possible, then there should be a standard way of writing comment blocks, for example an id or class whose value is "comment".
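
To make the id/class case concrete, here is a minimal sketch of that check. It uses only the Java standard library so it runs without extra dependencies; the class name CommentHeuristic and the method looksLikeComment are made up for illustration and are not part of Jsoup. It only covers pages that put "comment" somewhere in an id or class, so it does nothing for markup with no id or class at all.

```java
import java.util.Locale;

// Hypothetical helper, not part of Jsoup: decides whether an element's
// id or class attribute suggests that it holds a comment.
final class CommentHeuristic {
    private CommentHeuristic() {}

    // True if either attribute value contains "comment", ignoring case.
    static boolean looksLikeComment(String id, String className) {
        return mentionsComment(id) || mentionsComment(className);
    }

    // Null-safe, case-insensitive substring check on one attribute value.
    private static boolean mentionsComment(String attributeValue) {
        if (attributeValue == null) {
            return false;
        }
        return attributeValue.toLowerCase(Locale.ROOT).contains("comment");
    }
}
```

With Jsoup, the two arguments would presumably come from element.id() and element.className() while walking the parsed document. A substring match like this also accepts values such as "comments-count", so the matches are candidates to filter further, not a final answer.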

Any suggestions? Thanks.
 

CaldwellYSR

Member
Are you taking these comments from pages that you own? If so, why not just write them all in the same format? If you don't own these pages and you're trying to take comments from them, I don't know how legal (or, more importantly, how ethical) that is. I also don't really know how to help you anyway, because designers are going to use the method that suits them best, not the one that helps outsiders the most.
 

naveen

New Member
I don't own the pages; I am taking comments from any web page on the internet. Targeting a specific page would be easy, but I was looking for a general solution.

I think this is all right. Many organisations do social media analysis by reading comments and sentiment about products and services on social media websites, and several sell software for this that has to read comments from many sites.
I am also implementing software for social media analysis.

Yeah, you are right that web designers are going to write code according to their own convenience, but if they adopted a standard way it might make things easier later.
Well, thank you for your thoughts.
 