Parse HTML and Get All h3's After an h2 Before the Next h2 Using PHP


I am looking to find the first h2 in an article. Once it is found, I want to collect all h3's until the next h2 is found. Rinse and repeat until all headings and subheadings have been located.

Before you flag or close this question as a duplicate parsing question, please take note of the question title: this isn't about basic node retrieval. I've got that part down.

I am using DOMDocument to parse the HTML, with DOMDocument::loadHTML(), DOMDocument::getElementsByTagName() and DOMDocument::saveHTML() to retrieve the important headings of an article.

My code is as follows:

$matches = array();
$dom = new DOMDocument;
$dom->loadHTML($content);
// Collect every h2 in the document.
foreach ($dom->getElementsByTagName('h2') as $node) {
    $matches['heading-two'][] = $dom->saveHTML($node);
}
// Collect every h3 in the document.
foreach ($dom->getElementsByTagName('h3') as $node) {
    $matches['heading-three'][] = $dom->saveHTML($node);
}
if ($matches) {
    $this->key_points = $matches;
}

Which gives me output like this:

array(
    'heading-two' => array(
        '<h2>here first heading two</h2>',
        '<h2>here second heading two</h2>'
    ),
    'heading-three' => array(
        '<h3>here first h3</h3>',
        '<h3>here second h3</h3>',
        '<h3>here third h3</h3>',
        '<h3>here fourth h3</h3>',
    )
);

I'm looking to have something more like this:

array(
    '<h2>here first heading two</h2>' => array(
        '<h3>here h3 under first h2</h3>',
        '<h3>here h3 found under first h2, after first h3</h3>'
    ),
    '<h2>here second heading two</h2>' => array(
        '<h3>here h3 under second h2</h3>',
        '<h3>here h3 found under second h2, after first h3</h3>'
    )
);

I'm not exactly looking for code completion (if you feel it would help others better by doing so, go ahead), but more or less for guidance or advice in the right direction to accomplish the nested array directly above.

I assume that all headings are on the same level in the DOM, so that every h3 is a sibling of an h2. With that assumption, you can iterate over the siblings of each h2 until the next h2 is encountered:

foreach ($dom->getElementsByTagName('h2') as $node) {
    $key = $dom->saveHTML($node);
    $matches[$key] = array();
    // Walk the following siblings until the next h2 (or the end of the parent).
    while (($node = $node->nextSibling) && $node->nodeName !== 'h2') {
        if ($node->nodeName == 'h3') {
            $matches[$key][] = $dom->saveHTML($node);
        }
    }
}
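The sibling walk above depends on the assumption that every h3 shares a parent with its h2. If the headings may be wrapped in different containers, one possible alternative is to visit all h2 and h3 elements in document order and start a new group at each h2. The following is only a sketch under that assumption: the function name groupHeadings and the sample markup are hypothetical and not part of the original code, and it relies on DOMXPath returning the union `//h2 | //h3` in document order, which is how libxml behaves in practice.

```php
<?php
// Hypothetical helper (not from the original post): groups each h3 under the
// closest preceding h2 in document order, regardless of nesting depth.
function groupHeadings(string $html): array
{
    $dom = new DOMDocument();

    // Collect libxml parse warnings internally instead of emitting them.
    $previous = libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($dom);
    $matches = [];
    $currentKey = null;

    // The union query yields h2 and h3 nodes interleaved in document order.
    foreach ($xpath->query('//h2 | //h3') as $node) {
        if ($node->nodeName === 'h2') {
            // A new h2 starts a new group keyed by its HTML.
            $currentKey = $dom->saveHTML($node);
            $matches[$currentKey] = [];
        } elseif ($currentKey !== null) {
            // An h3 that appears before the first h2 is skipped.
            $matches[$currentKey][] = $dom->saveHTML($node);
        }
    }

    return $matches;
}

// Sample markup (an assumption for illustration): the second h3 sits in a
// different container than its h2, so a pure sibling walk would miss it.
$html = '<div><h2>First</h2><p>Intro</p><h3>A</h3></div>'
      . '<div><h3>B</h3></div>'
      . '<h2>Second</h2><h3>C</h3>';

print_r(groupHeadings($html));
```

With this sample, the h3 inside the second div is still grouped under the first h2. As with the sibling walk, two h2 elements with identical HTML would share one array key, so a numeric index may be a safer key if duplicate headings are possible.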
