Hi there!
I need to pull data from three of my other website pages. Apart from the popular commercial COM components that retrieve data from other web sites via HTTP requests, such as AspTear and AspHTTP, I am using the XMLHTTP object. I am facing a few problems handling this, so I'm keen to know the following:
1) Is it possible to extract just parts of a web page rather than the whole thing?
2) How can one run all the URLs of the extracted page inside our own web page, under our header or footer?
3) Is it possible to specify the height and width of the extracted page? That is, can the page be viewed inside a table of predefined height and width? If yes, how?
I need to view the pages inside one square whose height and width are 300 and 300 respectively.
It would be a great help if some helping soul could shed some light on the above issues.
1. If you are ripping stuff from another site then the answer is basically no. You would need to set something up with that site's owner to give you a cut-down version. You could, of course, parse everything when it comes back and try to grab just the bit you want out of the lot.
2. Parse the returned content to locate the URLs. Put them in an array, loop through and process.
3. Probably not. You would need to be sure of the content coming down. You could do something with frames but it would probably screw up 50% of the time.
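A minimal sketch of points 1 and 2 above. The thread is about classic ASP/XMLHTTP, so take this Python as an illustration of the technique, not the exact code; the sample HTML and the `<div id="news">` marker are invented, and in real use the string would be the body that comes back from your HTTP request.

```python
import re

# Sample response body; in practice this is what the HTTP request returns
# (XMLHTTP's responseText in classic ASP, urllib in Python).
html = '<body><div id="news"><a href="news.asp">News</a> ' \
       '<a href="events.asp">Events</a></div></body>'

# Point 1: grab just the bit you want out of the lot by slicing between
# two strings you know surround it (the markers here are assumptions).
start = html.find('<div id="news">')
end = html.find('</div>', start)
fragment = html[start:end + len('</div>')]

# Point 2: locate the URLs, put them in an array, loop through and process.
urls = re.findall(r'href="([^"]+)"', fragment)
for url in urls:
    print(url)  # here you would rewrite or re-request each URL
```

The same slice-between-markers approach works with `InStr` and `Mid` in VBScript; only the function names change.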
It sounds to me like you are harvesting content from these sites without any permission to do so. If this is the case you deserve all the problems you get. If this is not the case then have a talk to the providers of the information and get them to help you define a structure/method for the format and retrieval of said data.
First of all, thanks for your response! Let me take one issue at a time.
1) Yes, I am ripping stuff, but not from someone else's website. I am ripping it from the sites I made for my university back when I was doing higher secondary. I am still the webmaster of all three websites. While doing R&D on the internet I found the component, but the problem is that it gives me the full page of that website, not the exact things I want. Moreover, I don't know how one can grab the part that's needed and leave the remaining stuff.
I could easily paste the code of the respective website into this new one, but that's exactly what I don't want to do. All I want is this: if I update one of the websites, the respective content should update automatically on this new website.
2) Please elaborate on the first part of your suggestion, i.e. "Parse the returned content to locate the URLs." I am not getting it clearly.
3) As I said above, I get the full page every time I use the component. I have even tried to extract the page into a table of predefined height and width, but no success so far; it always comes in at 800x600 resolution. Is it only possible with IFRAMEs?
Well, Rock, a while back I was trying to harvest the contents of the MSN web page (especially its weather/stocks ticker) but haven't succeeded so far. I need that weather ticker for my university web page.
The best way to do this entire thing, IMO, is to start using XML.
Instead of requesting the entire page, you would make a request that returns just the "data" of the page... but I think this is more work than you want to do, and in areas you don't really want to get into.
As for parsing the HTML: is it HTML or XHTML? If it were XHTML you could parse it using the XML parser. I suspect it is straight HTML, so you would have to write something to locate the URLs within the HTML and then extract the target/source of each one. For example, say one of the URLs is in a hyperlink. You know that the structure in HTML for a hyperlink is to start with "<a ", so you go through your HTML (which is just a string after all) and look for all occurrences of this. You know that the anchor tag will end with ">", so you then start looking for that. Once you have found the start and finish of your tag, you parse its content in the same way. Does that make more sense? It's pretty messy really, but if you are going to be dealing with HTML as your source data you don't have a lot of choice.
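The string walk described above can be sketched like this (Python for illustration; the identical scan works in VBScript with `InStr`). The sample HTML is invented.

```python
# Walk the HTML string looking for "<a " and the ">" that closes the tag:
# HTML is just a string after all, so plain searching works.
html = 'Intro <a href="page1.asp">one</a> and <a href="page2.asp">two</a>.'

anchors = []
pos = html.find('<a ')
while pos != -1:
    end = html.find('>', pos)           # finish of the opening anchor tag
    anchors.append(html[pos:end + 1])   # keep the whole "<a ...>" tag
    pos = html.find('<a ', end)         # look for the next occurrence

# Each entry in anchors can now be parsed the same way to pull out
# the href (the target/source of the link).
```

This assumes well-formed tags; a missing ">" would need extra guarding, which is part of why parsing raw HTML by hand gets messy.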
Regarding the width/sizing issue: the problem is you can never be sure of what you are going to get. If you tried to use a table but your request returned an image wider than the table, the table would stretch to accommodate it. Frames won't stretch (unless you allow resizing, and then it's manual stretching), but you can let them scroll, which means all content is still visible. Hopefully that makes sense.
The task you have set yourself is in theory not that hard, but in practice a real pain in the backside. Requesting, parsing and displaying data are three of the easiest things you can do; the fact that you are working with straight HTML makes things messy, though.
I got your points, which were pretty well explained. Of the three websites, two are made in HTML and one in PHP and SQL Server 2000.
I had planned to learn XML after my university exams, so for now I have only a bit of an idea about it. Let me know how my life would be easier if the respective page were in XHTML rather than straight HTML.
And how would I make a request that returns just the "data" I want instead of requesting the entire page?
Earlier I thought it was a pretty easy job, but once I got into it my perception turned out to be totally wrong. I was under the impression that it worked like a free weather ticker (URL) which anyone can use anywhere on their website.
The thing with the ticker is that it is relatively small in nature so it's not a big problem. You are requesting entire pages.
XHTML is HTML that conforms to the XML standards: all tags are closed, structures are slightly different, and attributes are XML-compatible. Basically it means you can treat the page as if it were XML, so you wouldn't have to parse it manually but could use the XML parser and all of its prebuilt functions.
As for returning the data instead of the entire page, I don't think you can do it from what you have said. Am I correct in thinking that the entire page is generated from a single call to a component? If that's the case then you are stuck.
In the cases where I have done this, I have a component that returns the data for the page in XML format, and the page then overlays the XML with an XSL (XML stylesheet) which "presents" the data in HTML format.
If you were doing something similar, you could add a check for a query string which, if found, would disable the overlaying of the XSL and just return the XML. Then you could use your own XSL to present it the way you want.
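A toy sketch of that query-string switch. Everything here is hypothetical (the XML document, the parameter name `format=xml`), and since Python's standard library has no XSLT engine, the "overlay" step is stood in for by a hand-written HTML render; the shape is the point: one code path hands back the raw XML, the other returns the presented version.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML the data component might return (invented for illustration).
DATA_XML = '<weather><city>Delhi</city><temp>31</temp></weather>'

def render(query_string):
    # The check described above: "format=xml" disables the presentation
    # step and returns the raw XML for callers to style themselves.
    if 'format=xml' in query_string:
        return DATA_XML
    # Otherwise "overlay" the data -- a stand-in for the real XSL transform.
    root = ET.fromstring(DATA_XML)
    return '<b>%s: %s C</b>' % (root.findtext('city'), root.findtext('temp'))
```

In the classic ASP setup the thread describes, the branch would test `Request.QueryString` and the else-path would apply the XSL with MSXML's `transformNode`.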
Rock, I am really sorry... that is really not what I want to do.
Actually, I was thinking of trying the component once again with search strings, so I'm keen to know your opinion. I don't yet know whether I can pass strings to that component or not; let me read the online manual once again and get back to you for your support.
I devoted two full hours in the hope of finding something that could sort out the issue, but the result was not fruitful. I can't find anything related to it so far.
I have an idea I'm keen to share with you, but it comes with one big hurdle. Anyway, let me share the idea first: I was thinking of sending the request and getting the result as text, not as HTML, then writing it to a file, searching for the desired strings and scraping that part of the content. But how can I do the same thing dynamically? That's where I'm stuck again.
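The idea above (get the result as text, search for the desired strings, scrape that part) can be made dynamic by wrapping the search in a function: every time the page is requested again, re-running the function picks up whatever the source site currently says, with no file step needed. Python for illustration; the marker strings and sample text are assumptions.

```python
def extract_between(text, start_marker, end_marker):
    """Return whatever sits between the two marker strings, or '' if absent."""
    start = text.find(start_marker)
    if start == -1:
        return ''
    start += len(start_marker)
    end = text.find(end_marker, start)
    return text[start:end] if end != -1 else ''

# Pretend this came back as the request's responseText; re-fetching and
# re-running extract_between is what makes the scrape "dynamic".
page = 'junk <!--ticker-->Sunny, 31 C<!--/ticker--> more junk'
ticker = extract_between(page, '<!--ticker-->', '<!--/ticker-->')
```

This only works if you control the source pages (as you do here) and can rely on the markers staying put; for someone else's page, any layout change breaks the scrape.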