Bit more complicated PHP question

Porsche_maniak · May 27, 2010

Hi guys!
I am not so good with PHP, as you may already know...
So I can't achieve the following, and I hope you can help me with a script.

I want to get the contents of a .gz file (which holds a .txt file) and search that text for any of these strings: '[url=http://rapidshare', '[url=http://megaupload', '[url=http://hotfile' and '.torrent'.

If there is a match for [url=http://rapidshare, then show rapidshare.png
If there is a match for [url=http://megaupload, then show megaupload.png

and so on...

Thanks !

JmZ · May 27, 2010

Code:

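// gzfile() returns the decompressed lines (line endings kept), or false on failure
$lines = gzfile('yourfile.gz');
$file = ($lines === false) ? '' : implode('', $lines);

// image to show, string to look for
$matches = array(
    array('rapidshare.png', '[url=http://rapidshare'),
    array('megaupload.png', '[url=http://megaupload')
);

$image = null; // stays null if no host matches

foreach($matches as $match) {
    if(strpos($file, $match[1]) !== false) {
        $image = $match[0];
        break; // stop at the first host found
    }
}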

Something like this maybe.

This is assuming your gz file is purely a gzipped text file, not a tar.gz for example.

Porsche_maniak · May 27, 2010

Thanks JmZ, but I have many .gz archives.

litewarez · May 27, 2010

Firstly you need to break it up.

Archive Opening (extraction into memory) | gzopen
Parsing the contents (Extract links from tags) | preg_match_all
Compile the output.

So I would start by making a base class to work with:

PHP:

class GzipDownloads
{
   protected $gzfile = null;
   protected $gzcontents = '';

   function __construct($file)
   {
       if(!file_exists($file))
       {
           trigger_error('Unable to open ' . $file,E_USER_ERROR); //Die here
       }
       //gzopen() needs a mode. filesize() is the compressed size, so read in chunks until EOF instead
       $this->gzfile = gzopen($file,'rb');
       while(!gzeof($this->gzfile))
       {
           $this->gzcontents .= gzread($this->gzfile,8192);
       }
       gzclose($this->gzfile);
   }

   function getMeta()
   {
      //PREG_SET_ORDER gives one entry per [url=...] tag, with the url in [1]
      preg_match_all('/\[url=([^\]\s]+)\]/i',$this->gzcontents,$matches,PREG_SET_ORDER);
      $meta = array();
      foreach($matches as $match)
      {
          //Url segment should be [1]
          if(preg_match('/^http:\/\//i',$match[1]))
          {
               $usegments = parse_url($match[1]);
               if(!empty($usegments['host']))
               {
                   $host = str_replace(array('www.','.com','.net','.co.uk','.org'),'',$usegments['host']); //Remove www. and tld
                   $meta[$host] = true;
               }
          }
      }
      return $meta;
   }
}

Before I can build the getMeta() method, which will hold the links and other statuses, I need to examine the contents of the text file to look for similarities.

-- Updated CODE AND Read below

PHP:

$gzd = new GzipDownloads('my.file.txt.gz');
$meta = $gzd->getMeta();
foreach($meta as $host => $found) //getMeta() uses the host names as array keys
{
    if(file_exists('images/' . $host . '.png'))
    {
        //images/rapidshare.png exists so show it here!
        //$host will be the domain name without any tld so rapidshare links will be rapidshare, hotfile.com/../../../ will be hotfile
    }
}

The above is all example and untested.

JmZ · May 27, 2010

litewarez said:
Firstly you need to break it up.

Archive Opening (extraction into memory) | gzopen

Parsing the contents (Extract links from tags) | preg_match_all

Compile the output.

So I would start by making a base class to work with
Before I can build the getMeta() method, which will hold the links and other statuses, I need to examine the contents of the text file to look for similarities

He doesn't need regex, just to check if it contains rs, mu or whatever else.
Also, gzfile() should be fine for any gzipped text file.

Porsche_maniak: the code I posted will read one gzipped text file and check which hosts it contains. This seems like what you are asking for.

If you wish to handle multiple files, simply read each one in a loop or use a class and have an object per file.

Regardless of how you read it in though, that is how you should match certain strings.

litewarez · May 27, 2010

Updated my post, please check!

Edit:
I really don't know why people run from PCRE :/ It puzzles me. I mean, PHP/MySQL can handle hundreds or thousands of queries per second, yet people try so hard to get them down to like 3 :/

JmZ · May 27, 2010

Again, I'm not trying to say your code is wrong, lite, but regex is unnecessary here.

He needs to check for matches, he doesn't need the matches.
Strpos is much, much faster.

Also, since all he wants is to read the files in and check for string matches, such a complex object won't be needed.

litewarez · May 27, 2010

Yeah I know, dude. I'm just saying that with my code you only need to have an image in the folder and it will match for that as well.

So if a new host came out called terahost.com, he just needs to add terahost.png to the dir and it's found.

JmZ · May 27, 2010

Yes but the point is, it is excessive and slower than it could be.

All you need is to read each gz in, check for matches with strpos and use the correct image. It is a very, very simple problem with an equally simple solution.
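A minimal sketch of that strpos-style check, assuming you want every matching image rather than only the first one. The function name findHostImages() and the file names hotfile.png and torrent.png are made up for illustration; only rapidshare.png and megaupload.png were named in the thread. It uses stripos() so the check ignores case.

```php
<?php
// String to look for => image to show.
// hotfile.png and torrent.png are assumed file names.
$hostImages = array(
    '[url=http://rapidshare' => 'rapidshare.png',
    '[url=http://megaupload' => 'megaupload.png',
    '[url=http://hotfile'    => 'hotfile.png',
    '.torrent'               => 'torrent.png',
);

// Returns the images for every needle that occurs in $text.
function findHostImages($text, array $hostImages) {
    $found = array();
    foreach ($hostImages as $needle => $image) {
        if (stripos($text, $needle) !== false) {
            $found[] = $image;
        }
    }
    return $found;
}

// Sample text standing in for the decompressed .gz contents.
$sample = "[url=http://rapidshare.com/files/1]part 1[/url]\n"
        . "[url=http://hotfile.com/dl/2]mirror[/url]";

$images = findHostImages($sample, $hostImages);
foreach ($images as $image) {
    echo '<img src="images/' . $image . '" alt="" />' . "\n";
}
```

Adding a new host is then one more line in the array, with no regex involved.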

Porsche_maniak · May 27, 2010

litewarez, thanks a lot for your effort, but I think JmZ is right...

@JmZ
How do I read each one in a loop?

@litewarez
Hmm, sounds interesting...

JmZ · May 27, 2010

Do you mean one gz contains many files? Or many gzips that contain one file each?

If they each have one file, you just loop through them or stick them in an object.

For example, off the top of my head:

PHP:

$files = array('some.gz', 'file.gz', 'made.gz', 'up.gz');

function hostImage($filename) {
    $lines = gzfile($filename); // array of lines (line endings kept), or false on failure
    $file = ($lines === false) ? '' : implode('', $lines);

    $matches = array(
        array('rapidshare.png', '[url=http://rapidshare'),
        array('megaupload.png', '[url=http://megaupload')
    );

    $image = 'none.png';

    foreach($matches as $match) {
        if(strpos($file, $match[1]) !== false) {
            $image = $match[0];
            break;
        }
    }

    return $image;
}

foreach($files as $file) {
     $image = hostImage($file);
     // do other things here
}

It'd be nicer class based now I believe, because having the array of strings to match inside the function isn't ideal.

Also, this will return 'none.png' if nothing is found.

Porsche_maniak · May 27, 2010

@JmZ

Yeah... There are around 1300 .gz files in a folder and the number keeps increasing. Each .gz contains only one .txt file. I have the path where the .gz files are: $pathz = CONTENT_DIR.$y.'/'.$m;
Each page takes 10 .gz files from that path. When the user clicks the next page, it takes the next 10 .gz files.

I hope i haven't confused you.

JmZ · May 27, 2010

Yeah I see.

Well you'd want to find all gz files in the dir first, then loop through like above.

e.g.

change $files at the top of the code, to:

PHP:

$current_dir = getcwd();
chdir($pathz);
$files = glob('*.gz');
chdir($current_dir);
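For the 10-files-per-page part, here is a rough sketch of slicing the file list with array_slice(). The helper name pageOfFiles() and the generated file names are made up for the example. In the real script $files would come from glob() as above, and the page number from wherever your paging already gets it.

```php
<?php
// Returns the files for one page (pages start at 1).
function pageOfFiles(array $files, $page, $perPage = 10) {
    sort($files); // stable order, so pages do not shuffle between requests
    $offset = (max(1, (int) $page) - 1) * $perPage;
    return array_slice($files, $offset, $perPage);
}

// Stand-in for the glob() result: 25 made-up file names.
$files = array();
for ($i = 1; $i <= 25; $i++) {
    $files[] = sprintf('entry%03d.txt.gz', $i);
}

$pageTwo   = pageOfFiles($files, 2); // 11th to 20th file
$pageThree = pageOfFiles($files, 3); // the remaining 5 files
```

That way only the 10 files on the current page get opened and searched, not all 1300.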

CyberHacK · May 27, 2010

Coding Geek battle =/

Porsche_maniak · May 27, 2010

It is content/10/05
I tried

Code:

$pathz = CONTENT_DIR.$y.'/'.$m.'/'.$glo;

 $glo=glob('*.gz');

 
 echo $pathz;

to see if I would get the .gz file names, but it printed content/10/05/ again.

litewarez · May 27, 2010

PHP:

/*
    *Functions
*/
function hostImage($filename) {
    $lines = gzfile($filename); // array of lines (line endings kept), or false on failure
    $file = ($lines === false) ? '' : implode('', $lines);

    $matches = array(
        array('rapidshare.png', '[url=http://rapidshare'),
        array('megaupload.png', '[url=http://megaupload')
    );

    $image = 'none.png';

    foreach($matches as $match) {
        if(strpos($file, $match[1]) !== false) {
            $image = $match[0];
            break;
        }
    }
    return $image;
}

//Directory stuff
$gdir = CONTENT_DIR.$y.'/'.$m.'/*.gz';

foreach(glob($gdir) as $file)
{
    $image = hostImage(CONTENT_DIR.$y.'/'.$m.'/' . $file);
    echo $image; // rapidshare.png or megaupload.png
}

Porsche_maniak · May 27, 2010

Warning: gzfile(content/10/05/content/10/05/entry100501-210451.txt.gz) [function.gzfile]: failed to open stream: No such file or directory

The content/10/05/ part appears twice; the second one shouldn't be there.

Lock Down · May 27, 2010

Wow what a blob of too many coders.

First, let me say I have no idea where your code came from, as I don't see CONTENT_DIR being defined anywhere.
But looking at what it is producing, try removing the CONTENT_DIR. part from the command, whatever it was.
Seeing your complete code for these routines would be nice, since you are following two people's advice.

litewarez · May 27, 2010

No:

Change

$image = hostImage(CONTENT_DIR.$y.'/'.$m.'/' . $file);

to

$image = hostImage($file);

I forgot that glob() returns each path with the directory part of the pattern already included.
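A small self-contained sketch of what glob() hands back, assuming the zlib extension is loaded. The temporary directory and the file name entry.txt.gz are made up for the demo. Each returned path already starts with the directory from the pattern, so it can go straight into gzfile(), and basename() gives the bare file name if you need it.

```php
<?php
// Throwaway directory with one gzipped text file (names are made up).
$dir = sys_get_temp_dir() . '/glob_demo_' . uniqid();
mkdir($dir);
file_put_contents(
    $dir . '/entry.txt.gz',
    gzencode("[url=http://rapidshare.com/files/1]x[/url]\n")
);

$paths = glob($dir . '/*.gz');

$first    = $paths[0];          // already contains $dir, do not prepend it again
$nameOnly = basename($first);   // just the file name
$text     = implode('', gzfile($first));

echo $first . "\n" . $nameOnly . "\n";

// Clean up the demo files.
unlink($first);
rmdir($dir);
```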

Porsche_maniak · May 28, 2010

Thanks guys !
I managed to get it working...
