api ai tutorial

Posts

Showing posts from June, 2018

Getting personal channel Id after oAuth

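API reference: https://developers.google.com/youtube/v3/docs/channels/list
To get your own channel ID after OAuth, call channels.list with mine=true. On the reference page, choose the list example for "my channel", which links to the API Explorer: https://developers.google.com/apis-explorer/#p/youtube/v3/youtube.channels.list?part=snippet&mine=true&_h=2&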
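As a rough sketch of the same call outside the API Explorer, the Python below builds a channels.list request with mine=true and reads the channel ID from a response. The token value and the sample payload are placeholders invented for illustration. The endpoint and the items[0].id field follow the YouTube Data API v3 reference linked above. The request is only constructed here, not sent.

```python
import json
import urllib.request
from urllib.parse import urlencode

API_URL = "https://www.googleapis.com/youtube/v3/channels"


def build_my_channel_request(access_token):
    """Build (but do not send) a channels.list request for the authorized user."""
    query = urlencode({"part": "snippet", "mine": "true"})
    return urllib.request.Request(
        f"{API_URL}?{query}",
        headers={"Authorization": f"Bearer {access_token}"},
    )


def extract_channel_id(payload):
    """Return the first channel ID in a channels.list response, or None."""
    items = payload.get("items", [])
    return items[0]["id"] if items else None


# Placeholder token and hand-written sample response, for illustration only.
request = build_my_channel_request("PLACEHOLDER_TOKEN")
sample_payload = json.loads(
    '{"items": [{"id": "UC_example", "snippet": {"title": "demo"}}]}'
)
channel_id = extract_channel_id(sample_payload)
```

With a real OAuth access token, the request could be sent with urllib.request.urlopen(request) and the JSON body passed to extract_channel_id.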

Download entire channel youtube-dl

youtube-dl ytuser:<USER>
Reference: https://askubuntu.com/questions/438376/how-to-download-all-videos-on-a-youtube-channel/376274

Extract an href between double quotes with a regex

The modified pattern below matches href values that start with /channel:
hrefsInAjaxCallPattern = 'href=(\"/channel/.*?\")'
In general, use re.findall(pattern, string) to find every match of a pattern in a string.
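A minimal sketch of that pattern in use. The sample markup is made up for illustration; only the pattern itself comes from the post.

```python
import re

# Pattern from the post: captures the quoted href value when the URL
# starts with /channel/. The lazy .*? stops at the first closing quote.
hrefs_in_ajax_call_pattern = r'href=("/channel/.*?")'

# Hypothetical sample markup, invented for this example.
sample_html = (
    '<a href="/channel/UC123">A</a> '
    '<a href="/watch?v=x">B</a> '
    '<a href="/channel/UC456">C</a>'
)

# findall returns the captured group for every match, quotes included.
matches = re.findall(hrefs_in_ajax_call_pattern, sample_html)

# Strip the surrounding double quotes to get bare paths.
channel_paths = [m.strip('"') for m in matches]
```

Because the capture group includes the quotes, the raw matches still carry them. Moving the parentheses inside the quotes would be an alternative way to drop them.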

extract href from response.css output

response.css('a.yt-simple-endpoint.style-scope.ytd-grid-channel-renderer ::attr(href)').extract()

Understanding, testing, and using regex

Understanding:
https://medium.com/factory-mind/regex-tutorial-a-simple-cheatsheet-by-examples-649dc1c3f285
http://www.vogella.com/tutorials/JavaRegularExpressions/article.html
Testing:
https://regex101.com/
Using:
https://stackoverflow.com/questions/40458087/python-extract-text-from-string?rq=1
https://stackoverflow.com/questions/4666973/how-to-extract-a-substring-from-inside-a-string-in-python

start scrapy splash local server

docker run -p 8050:8050 scrapinghub/splash

scrape using scrapy and splash and execute inner javascript, script tags

import pickle

import scrapy


class MySpider(scrapy.Spider):
    name = "youtubesc"
    # The target page is requested through the local Splash render.html endpoint.
    start_urls = [
        "http://localhost:8050/render.html?url=https://www.youtube.com/channel/UCv1Ybb65DkQmokXqfJn0Eig/channels"
    ]

    def start_requests(self):
        for url in self.start_urls:
            yield scrapy.Request(url, self.parse)

    def parse(self, response):
        self.log("this program just visited " + response.url)
        print("response")
        print(response.text)
        # print(response.css('a.ux-thumb-wrap.yt-uix-sessionlink .spf-link').extract())
        filename = "pp.html"
        # Note: pickle.dump stores the body as a pickled bytes object, not as raw HTML.
        with open(filename, 'wb') as f:
            pickle.dump(response.body, f)
        # yield {
        #     'author_name': response.css('small.author::text').extract_first()
        # }

We go through the local Splash server because the normal methods explained on the website were not working.
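The start URL above embeds the target page in the Splash render.html endpoint by hand. A small sketch of building that URL with proper escaping is shown below. The helper name splash_render_url is hypothetical, the base address assumes the local Docker container on port 8050, and wait is an optional render.html argument that the post does not use.

```python
from urllib.parse import urlencode

# Assumes Splash is running locally on port 8050.
SPLASH_BASE = "http://localhost:8050"


def splash_render_url(target_url, wait=None):
    """Return a render.html URL that asks Splash to render target_url."""
    params = {"url": target_url}
    if wait is not None:
        # Optional: seconds Splash waits after page load before returning HTML.
        params["wait"] = wait
    return f"{SPLASH_BASE}/render.html?{urlencode(params)}"


plain = splash_render_url("https://www.youtube.com/channel/X/channels")
delayed = splash_render_url("https://www.youtube.com/channel/X/channels", wait=2)
```

Escaping the target URL matters once it carries its own query string, since an unescaped & would otherwise be read as a separate Splash argument.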

setup a new scrapy project from cli

scrapy startproject first_scrapy
Reference: https://www.tutorialspoint.com/scrapy/scrapy_create_project.htm

Sample program:

import scrapy


class FirstSpider(scrapy.Spider):
    name = "first"
    allowed_domains = ["dmoz.org"]
    start_urls = [
        "http://www.dmoz.org/Computers/Programming/Languages/Python/Books/",
        "http://www.dmoz.org/Computers/Programming/Languages/Python/Resources/",
    ]

    def parse(self, response):
        # Name the output file after the last path segment of the URL.
        filename = response.url.split("/")[-2] + '.html'
        with open(filename, 'wb') as f:
            f.write(response.body)

To run the program, go to the project directory and type "scrapy crawl <name>", where <name> is the value of name in the spider. Here it is 'first', because name = "first".
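The filename logic in parse can be checked without running a crawl. This sketch applies the same split to the two start URLs. The function name output_filename is hypothetical.

```python
def output_filename(url):
    """Mirror the spider's naming rule: last path segment plus '.html'."""
    # The URLs end with '/', so the final split item is '' and the
    # last real segment sits at index -2.
    return url.split("/")[-2] + ".html"


urls = [
    "http://www.dmoz.org/Computers/Programming/Languages/Python/Books/",
    "http://www.dmoz.org/Computers/Programming/Languages/Python/Resources/",
]
filenames = [output_filename(u) for u in urls]
```

Note that the rule depends on the trailing slash. Without it, index -2 would pick the parent segment instead.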

Elastic Search multi match query equivalent with normal query

Reference: https://stackoverflow.com/questions/25537322/find-out-which-fields-matched-in-a-multi-match-query

There is a more exact way to find out which field matched the query. Highlighting is a post-processing step, so it is not accurate for this purpose. Use named queries instead of multi_match. For example, this query:

{
  "multi_match": {
    "query": "query phrase here",
    "fields": ["name", "tag", "categorys"],
    "operator": "AND"
  }
}

translates into a bool query with named clauses:

"should": [
  { "match": { "name": { "query": "query phrase here", "_name": "name_field" } } },
  { "match": { "tag": { "query": "
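A minimal sketch of generating the named bool query from a field list, so the clauses do not have to be written by hand. The function name is hypothetical, and carrying the operator into each match clause is an assumption, since the original snippet is cut off before showing it. The "<field>_field" naming follows the "name_field" example above.

```python
def multi_match_to_named_bool(query, fields, operator="and"):
    """Build a bool/should query with one named match clause per field."""
    return {
        "bool": {
            "should": [
                {
                    "match": {
                        field: {
                            "query": query,
                            "operator": operator,  # assumption, see lead-in
                            "_name": f"{field}_field",
                        }
                    }
                }
                for field in fields
            ]
        }
    }


named_query = multi_match_to_named_bool(
    "query phrase here", ["name", "tag", "categorys"]
)
clause_names = [
    next(iter(clause["match"].values()))["_name"]
    for clause in named_query["bool"]["should"]
]
```

When such a query runs, each hit should list the names of the clauses it satisfied under matched_queries, which is what identifies the matching fields.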

Elastic Search highlight matched Text

GET testindividualvideos/_search
{
  "query": {
    "match": {
      "dialog": {
        "query": "wise",
        "analyzer": "synonyms"
      }
    }
  },
  "highlight": {
    "fields": {
      "dialog": {} // will highlight all the matches in the dialog
    }
  }
}

Adding wordnet synonym for elasticSearch

The setting for the wordnet synonym filter:

PUT /individualvideos
{
  "settings": {
    "analysis": {
      "analyzer": {
        "synonyms": {
          "filter": ["lowercase", "synonym_filter"],
          "tokenizer": "standard"
        }
      },
      "filter": {
        "synonym_filter": {
          "type": "synonym",
          "format": "wordnet",
          "synonyms_path": "analysis/wn_s.pl"
        }
      }
    }
  }
}

Query:

GET allvideos/_search
{
  "query": {
    "match": {
      "description": {
        "query": "tell me about relationship",
        "analyzer": "synonyms"
      }
    }
  }
}

elastic search all queries

PUT twitter/_doc/1
{
  "user": "kimchy",
  "message": "trying out Elasticsearch having running all the way goes down sizing are is the"
}

PUT /my_index1
{
  "settings": {
    "analysis": {
      "analyzer": {
        "my": {
          "type": "standard",
          "stopwords": ["is", "having"]
        }
      }
    }
  }
}

GET /_search?q=having

GET /_analyze
{
  "analyzer": "standard",
  "text": "trying out Elasticsearch having running all the way goes down sizing are is the"
}

GET /_analyze?tokenizer=whitespace
{"You're the 1st runner home!"}

POST _analyze
{
  "analyzer": "my_analyzer",
  "text": "The quick brown fox."
}

PUT /my_index
{ "mappings": { "blog": { "properti

elastic search query for synonym getting lower score in search

The first working sample query. It will be modified as per the use case.

PUT my_index4
{
  "settings": {
    "analysis": {
      "filter": {
        "arabic_stop": { "type": "stop", "stopwords": "_arabic_" },
        "arabic_keywords": { "type": "keyword_marker", "keywords": ["مثال"] },
        "arabic_stemmer": { "type": "stemmer", "language": "arabic" }
      },
      "analyzer": {
        "stemming_analyzer": {
          "tokenizer": "standard",
          "filter": [ "lowercase", "arabic_stop", "arabic_normalization", "arabic_keywords",

machine learning synonym

https://stackoverflow.com/questions/28305250/elasticsearch-customize-score-for-synonyms-stemming?utm_medium=organic&utm_source=google_rich_qa&utm_campaign=google_rich_qa

node js: handle unhandled promise rejection

process.on('unhandledRejection', (reason, p) => {
  console.log('Unhandled Rejection at: Promise', p, 'reason:', reason);
  // application specific logging, throwing an error, or other logic here
});

find length of object in nodejs

Nothing else needs to be imported. Just install the package:
npm i object-length
Reference: https://www.npmjs.com/package/object-length

Non-ASCII character '\xe0' python

Error: Non-ASCII character '\xe0'
Fix: add this line at the top of the Python file (it must be on the first or second line):
# -*- coding: utf-8 -*-
Reference: https://stackoverflow.com/questions/10589620/syntaxerror-non-ascii-character-xa3-in-file-when-function-returns-%C2%A3

close terminal without killing command

Use nohup.
Reference: https://www.wikitechy.com/technology/can-close-terminal-without-killing-command-running/

transliteration google api node js (eg of hindi here)

transliteration in python for indian languages especially hindi

# -*- coding: utf-8 -*-
from transliteration import getInstance

t = getInstance()
text = u"who are u what is this buddha"
# "hi_IN" selects Hindi as the target language.
t_text = t.transliterate(text, "hi_IN")
print(t_text)

Download only audio from youtube

youtube-dl ub8G69LkcOI -x --audio-format opus

run python from inside of node js

function callName(req, res) {
  // Use the spawn method from the child_process module.
  const spawn = require("child_process").spawn;

  // Parameters passed to spawn:
  //   1. the type of script (the interpreter to run)
  //   2. a list containing the path of the script and its arguments
  // E.g.: http://localhost:3000/name?firstname=Mike&lastname=Will
  // so first name = Mike and last name = Will
  const pythonProcess = spawn('python', ["./hello.py", req.query.firstname, req.query.lastname]);

  // Take the stdout data from the script that was executed
  // with the arguments and send it to the res object.
  pythonProcess.stdout.on('data', function (data) {
    res.send(data.toString());
  });
}

Reference: https://www.geeksforgeeks.org/run-python-script-node-js-us
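The Node handler above expects a hello.py that reads two command-line arguments and prints something to stdout. The post does not show that script, so the following is only a guess at a minimal counterpart. The greeting text and the build_greeting helper are invented for illustration.

```python
import sys


def build_greeting(argv):
    """Format a greeting from argv-style input: [script, first, last]."""
    first_name, last_name = argv[1], argv[2]
    return f"Hello {first_name} {last_name}"


if __name__ == "__main__":
    # Node passes firstname and lastname as argv[1] and argv[2].
    if len(sys.argv) >= 3:
        print(build_greeting(sys.argv))
        # Flush so the Node 'data' handler receives the output promptly.
        sys.stdout.flush()
```

Whatever the script prints is what the stdout 'data' callback forwards to res.send, so a single print keeps the HTTP response to one chunk.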