C# - Parsing HTML to get script variable value
I'm trying to find a method of accessing the data between tags returned by a server I am making HTTP requests to. The document has multiple script tags, but only one of them has JavaScript code between its tags; the rest are included files. I want to access the code inside that script tag.
An example of the code is:
<html>
// html
<script>
var spect = [['temper', 'init', []], ['fw\/lib', 'init', [{staticroot: '//site.com/js/'}]], ["cap","dm",[{"tackmod":"profile","xmod":"timed"}]]];
</script>
// more html
</html>
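I'm looking for an ideal way to grab the data assigned to 'spect' and parse it. Sometimes there is a space between 'spect' and the '=', and sometimes there isn't. I have no idea why, and I have no control over the server.
I know this question may have been asked before, but the responses suggest using something like HtmlAgilityPack, and I'd rather avoid using a library for this task, since I only need to get the JavaScript from the DOM once.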
Here is a very simple example of how easy this could be using HtmlAgilityPack, with the Jurassic library to evaluate the result:
using System;
using System.Linq;
using Jurassic.Library; // JSONObject

var html = @"<html>
             // html
             <script>
                 var spect = [['temper', 'init', []], ['fw\/lib', 'init', [{staticroot: '//site.com/js/'}]], [""cap"",""dm"",[{""tackmod"":""profile"",""xmod"":""timed""}]]];
             </script>
             // more html
             </html>";

// Grab the content of the first script element
HtmlAgilityPack.HtmlDocument doc = new HtmlAgilityPack.HtmlDocument();
doc.LoadHtml(html);
var script = doc.DocumentNode.Descendants()
                             .Where(n => n.Name == "script")
                             .First().InnerText;

// Return the data of spect and stringify it into a proper JSON object
var engine = new Jurassic.ScriptEngine();
var result = engine.Evaluate("(function() { " + script + " return spect; })()");
var json = JSONObject.Stringify(engine, result);

Console.WriteLine(json);
Console.ReadKey();
Output:
[["temper","init",[]],["fw/lib","init",[{"staticroot":"//site.com/js/"}]],["cap","dm",[{"tackmod":"profile","xmod":"timed"}]]]
Note: I am not accounting for errors or anything else; this merely serves as an example of how to grab the script and evaluate the value of spect.
There are a few other libraries for executing/evaluating JavaScript as well.
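For the library-free route the question asks about, a regular expression that tolerates the optional whitespace around '=' may be enough to pull out the raw literal. The following is only a rough sketch: the SpectExtractor class and its Extract method are hypothetical names, and it assumes the value is an array literal terminated by "];" and that no string inside the literal contains "];".

```csharp
using System;
using System.Text.RegularExpressions;

public static class SpectExtractor
{
    // \s* on both sides of '=' accepts "spect=", "spect =" and "spect = ".
    // The lazy group stops at the first ']' that is followed by ';', so this
    // assumes no string inside the literal contains "];".
    private static readonly Regex SpectPattern = new Regex(
        @"\bvar\s+spect\s*=\s*(?<value>\[.*?\])\s*;",
        RegexOptions.Singleline);

    // Returns the raw JavaScript array literal, or null when nothing matches.
    public static string Extract(string html)
    {
        Match match = SpectPattern.Match(html);
        return match.Success ? match.Groups["value"].Value : null;
    }
}
```

Calling SpectExtractor.Extract(html) on the page source would return the text of the array literal only. That text is still JavaScript rather than strict JSON (single-quoted strings, the unquoted staticroot key), so it would still have to go through a JavaScript evaluator or a lenient parser before it can be used as data.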