I have a huge .csv file with the following headers:


Embedded in the URL requests are YouTube video IDs that need to be extracted.


I can achieve this in bash, but I would like to do it in Ruby (still learning).

    cut -d , -f 2 urls.csv | grep 'watch?v='


"" "" "" "" "" 

The YouTube video IDs are the 11 characters after watch?v=, up to the first &.
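
As a minimal sketch of that rule in Ruby, a regular expression can capture everything between watch?v= and the first &. The helper name `extract_watch_id` and the sample URL are hypothetical, made up only for illustration.

```ruby
# Hypothetical helper: returns the ID that follows "watch?v=",
# stopping at the first "&", or nil when the URL has no watch?v= part.
def extract_watch_id(url)
  url[/watch\?v=([^&]+)/, 1]
end

# Made-up sample URL, not taken from the real log.
puts extract_watch_id("http://www.youtube.com/watch?v=abcdefghijk&feature=related")
# prints: abcdefghijk
```

Because the pattern requires the literal watch?v=, requests that carry a v parameter on some other path return nil.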



    require 'csv'
    require 'addressable/uri'

    # Read lines from the CSV, with headers on
    lines = CSV.readlines("test.csv", :headers => true)

    # Print CSV column headers 'date , time', 'url'
    #p lines['date , time']
    #p lines['url']
    #timestamp = lines['date , time']
    urls = lines['url']

    # For each line (URL), get the query value
    urls.each do |url|
      v = Addressable::URI.parse(url).query_values["v"]
      if (v)
        puts v # prints the value if found
      end
    end

The code above outputs every video ID contained in the requests, not only those from watch?v= requests, so there are lots of duplicates.

How can I make it output only the video IDs that have the watch?v= prefix (along with the timestamp and IP)? That prefix indicates the video has actually been played. Thanks.

The support for slicing and dicing a URI is limited in Ruby's core URI class. The other option is addressable/uri.
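
For comparison, here is a rough sketch of what the core library asks for: `URI.parse` only hands back the raw query string, so it has to be decoded separately with `URI.decode_www_form`. The helper name `core_query_values` and the sample URL are assumptions for illustration only.

```ruby
require 'uri'

# Hypothetical helper: builds a Hash of query parameters using only the
# standard library. Returns an empty Hash when the URL has no query string.
def core_query_values(url)
  query = URI.parse(url).query
  return {} if query.nil?
  URI.decode_www_form(query).to_h
end

# Made-up sample URL.
params = core_query_values("http://www.youtube.com/watch?v=abcdefghijk&feature=related")
puts params["v"]
# prints: abcdefghijk
```

Note that `URI.parse` raises `URI::InvalidURIError` on malformed input, so log data may need a rescue around it.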

    require 'addressable/uri'
    uri = Addressable::URI.parse('')
    uri.query_values["v"] # query_values returns key-value pairs of the query components
    => "nlih9ca-ftg"

Here's a snippet:

    urls = ["", "", "", "", ""]

    urls.each do |url|
      v = Addressable::URI.parse(url).query_values["v"]
      puts v
    end


    chzen7tmzja
    wavl_ijv5ei
    8t2s9hsrkl8
    ssdqcluh00c
    nlih9ca-ftg

You can install addressable/uri with:

    sudo gem install addressable
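
To keep only the requests that actually play a video, one possible approach is to check that the URL path is /watch before reading the v parameter, and to carry the timestamp along. The sketch below uses only the standard library so it runs without extra gems; the method names and the column names 'date , time' and 'url' are assumptions based on the comments in the question, so adjust them to the real headers.

```ruby
require 'uri'

# Hypothetical helper: returns the video ID only for /watch?v=... requests,
# and nil for anything else (other paths, no v parameter, unparsable URLs).
def watch_video_id(url)
  uri = URI.parse(url.to_s)
  return nil unless uri.path == '/watch' && uri.query
  URI.decode_www_form(uri.query).to_h['v']
rescue URI::InvalidURIError, ArgumentError
  nil
end

# rows can be any collection of objects that respond to [] with a column
# name, for example CSV rows read with headers enabled, or plain Hashes.
# The default column names are assumptions, not confirmed headers.
def played_videos(rows, time_key: 'date , time', url_key: 'url')
  rows.each_with_object([]) do |row, played|
    id = watch_video_id(row[url_key])
    played << [row[time_key], id] if id
  end
end
```

The rows returned by `CSV.readlines("test.csv", :headers => true)` could be passed straight in, and each result is a `[timestamp, video_id]` pair. An IP column, if the file has one, could be added to the pair in the same way.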


