Building a page parser with PHP, want to use some jQuery/Ajax

Alright guys! I have been searching, but I have trouble finding a solution to my problem. And in advance, sorry for my bad English.

I'm building a small parser for news articles from one specific news site. I also want the code to be prepared for adding other news pages as well, which is why it is the way it is.

I want the page to reload the content without refreshing the page. I know it takes a while to retrieve the content from the selected URL, which is why I want to add a progress bar with jQuery UI (I know it's a lot to ask for). The progress bar is optional.
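
Roughly what I have in mind could be sketched like this. It is only a sketch under assumptions: `parse.php` is a hypothetical endpoint that would run the parsing code and echo just the article HTML, `#result` and `#progress` are hypothetical elements that would have to be added to the page, and `isUsableUrl` and `initParserForm` are made-up helper names. jQuery UI (1.10 or later) would also have to be loaded next to jQuery, since `value: false` is what gives an indeterminate progress bar.

```javascript
// Sketch only: "parse.php", "#result" and "#progress" are hypothetical
// names that do not exist in the current page.

// Pure helper: is the input worth sending to the server at all?
function isUsableUrl(value) {
  return typeof value === 'string' && value.trim() !== '';
}

// Wires up the form. jQuery is passed in as `$` so nothing here runs
// unless a browser with jQuery (and jQuery UI) is actually present.
function initParserForm($) {
  // Indeterminate progress bar, hidden until a request starts.
  var $progress = $('#progress').progressbar({ value: false }).hide();

  $('form').on('submit', function (event) {
    event.preventDefault(); // stop the normal full-page submit

    var url = $(this).find('input[name="s"]').val();
    if (!isUsableUrl(url)) {
      $('#result').text('You need to write the whole article URL');
      return;
    }

    $progress.show();
    $.post('parse.php', { s: url })
      .done(function (html) { $('#result').html(html); })
      .fail(function () { $('#result').text('Could not load the article.'); })
      .always(function () { $progress.hide(); });
  });
}

// Only hook things up when jQuery is available (i.e. in the browser).
if (typeof jQuery !== 'undefined') {
  jQuery(function () { initParserForm(jQuery); });
}
```

With something like this, the PHP part that fetches and prints the article would move into its own file, and the main page would only contain the form plus the two extra elements.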

And I'm using Simple HTML DOM Parser.

<?php
// Page load time: remember when we started
$starttime = explode(' ', microtime());
$starttime = $starttime[1] + $starttime[0];
?>
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html>
<head>
<meta http-equiv="content-type" content="text/html;charset=utf-8"/>
<title>svd parser</title>
<link rel="shortcut icon" href="favicon.ico" type="image/x-icon"/>
<link rel="stylesheet" type="text/css" href="style.css"/>
<script type="text/javascript" src="jquery-1.10.2.min.js"></script>
</head>
<body>
<div class="container">
    <div id="head">
    <h1>svd parser</h1>
    <hr>
    <form action="index.php" method="post">
    <input type="text" name="s" placeholder="enter url to start svd parser" style="width: 495px;">
    <input type="submit" value="svd parser it">
    </form>
<?php
// PHP superglobals are case-sensitive: it must be $_POST, not $_post
if (isset($_POST["s"]) && trim($_POST["s"]) != "") {

    // What is the domain?
    preg_match('@^(?:http://)?([^/]+)@i', $_POST["s"], $matches);
    $host = $matches[1];
    // Get the last two segments of the host name
    preg_match('/[^.]+\.[^.]+$/', $host, $matches);
    echo "<b>domain name is: {$matches[0]}.</b><br>\n";

    // Returns the selectors to use for a known domain, or null otherwise
    function checkdomaingetrightvalues($domain) {
        if ($domain == "svd.se") {
            $h1 = "h1";
            $page = "p[class=preamble], div[class=articletext]";
            return array('h1' => $h1, 'searchparse' => $page);
        } else {
            return null;
        }
    }

    include('simple_html_dom.php');
    $html = new simple_html_dom();
    $ids = checkdomaingetrightvalues($matches[0]);

    // Get the page
    $html = file_get_html($_POST['s']);

    // Find the h1
    $ret = $html->find($ids['h1']);

    // Strip the h1 of HTML tags (a href) and add h1 tags
    echo "<h1>" . strip_tags($ret[0]) . "</h1>";

    // Find the actual article and forget everything else
    // Function for extracting the right parse lines
    //$values = checkdomaingetrightvalues($matches[0]);
    $ret = $html->find($ids['searchparse']);

    // Prints the article without HTML tags, except <p> so it can be read
    // Print the first part of the article as a hint
    echo "<p><b>" . strip_tags($ret[0]) . "</b></p>";

    // Here is the actual article
    $a = html_entity_decode($ret[1]);
    echo strip_tags($a, '<p>');
    $html->clear();
    unset($html);

} else {
    echo "You need to write the whole article URL<br>";
}

// Page load time
$mtime = explode(' ', microtime());
$totaltime = $mtime[0] + $mtime[1] - $starttime;
printf('page loaded in %.3f seconds.', $totaltime);
?>
    </div>
    <div id="sidebar">
    <b>svd</b>
    </div>
</div>
</body>
</html>

I would appreciate it if you could at least point me in the right direction!

Bradly
