It took me about a week of my Dashain vacation to make this simple captcha breaker (5 days, to be exact). I had been studying Image Processing in my CS undergrad for some time by then and wanted to put my skills to use, although, I must admit, it’s not as complicated or as full of image processing bits as I’d have liked. I had only learned the basics when I started doing this. If I were to do it now, I would definitely do it differently.
Neural Network was one of the electives available during my 5th semester, and knowing that it was available, I was certain that I wanted to study it over any other elective. But, unfortunately (or fortunately), my college, like many other colleges in Nepal, made it compulsory to take Cryptography as the elective subject. I obviously wanted to study Neural Net, but I could not convince the college to teach ANN instead of Cryptography. I was left with no choice but to study both, which I did. A few of us studied both subjects as electives. Since the teacher was not available on weekdays, we took classes every weekend for 3 hours each. So there were no holidays for us for 4 months, I think.
I used this dataset while doing a project for my undergrad Neural Network coursework. We had to implement a handwritten digit recognition neural net using the MNIST dataset. After finishing it, I looked around for a Devanagari dataset and found one at CVResearchNepal.com, which seems to have expired as of this moment.
Packt Publishing has a lot of good premium books, and every day they put up one premium book as a free download. You can get a premium book a day by going to the Free Learning eBook page. I chose to automate the process and have the book grabbed automatically to my PacktPub account every day. I have made the script available on GitHub for anyone who is interested in doing the same: PacktPub Grabber.
I put together this script, called csvify_fortinet_logs, to convert the space-separated format of the Fortinet Router’s Fortiguard WebFilterLog to CSV, a more widely used format that is easier to analyse. I wanted to analyse the router’s web filter log but could not use it directly as input to pandas, so I had to code this.
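For illustration only, here is a minimal sketch of that kind of conversion. It is not the actual csvify_fortinet_logs code. It assumes each log line is a run of space-separated `key=value` fields whose values may be double-quoted, and the function names `parse_line` and `logs_to_csv` are made up for this example.

```python
import csv
import io
import shlex


def parse_line(line):
    """Split one log line into a dict of fields.

    Assumption: fields look like key=value or key="value with spaces".
    shlex.split keeps a quoted value together and strips the quotes.
    A value with an unbalanced quote would raise ValueError here.
    """
    fields = {}
    for token in shlex.split(line):
        key, sep, value = token.partition("=")
        if sep:  # ignore tokens that are not key=value pairs
            fields[key] = value
    return fields


def logs_to_csv(lines, out):
    """Write the parsed lines to `out` as CSV and return the column names."""
    rows = [parse_line(line) for line in lines if line.strip()]

    # Lines may carry different fields, so collect every key in first-seen order.
    columns = []
    for row in rows:
        for key in row:
            if key not in columns:
                columns.append(key)

    writer = csv.DictWriter(out, fieldnames=columns, restval="")
    writer.writeheader()
    writer.writerows(rows)
    return columns
```

Passing an open file as `out` should give a CSV that `pandas.read_csv` can load, with an empty cell wherever a line lacked a field.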
I am kind of changing the way I do things here on the blog, by introducing new categories and maybe letting go of some of the old ones. It’s been over 5 years, and I feel that I need to make some changes now.
SORRY THIS NO LONGER WORKS. Import.io REMOVED THEIR FREE PLAN AND I JUST COULDN’T PAY TO KEEP THIS RUNNING.
Hi guys, it’s been a long time. I am tired of making empty promises (about posting regularly), so let me not do it again. When I was beginning my data-science journey, I tried to collect as many sources of Nepali datasets as possible, and the following is a list of them. The problem is that most of these datasets are in PDF (mostly as booklets), so you’d have to use an extraction utility such as Tabula to convert them into CSV or another workable file format.
Hi guys, it’s been a while. I have been busy with my studies. I will try my best to keep updating the blog regularly (i.e., at least 2-3 posts a week), but please don’t hold me to it.
You can use the YouTube API to grab details of YouTube videos. Although more complex scripts require you to set up authentication using OAuth, that is not required for simple tasks such as collecting the view counts of a few videos.
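As a rough sketch of the simple, no-OAuth case, the snippet below asks the YouTube Data API v3 `videos` endpoint for `part=statistics` using a plain API key. The helper names (`build_url`, `extract_views`, `fetch_views`) are made up for this example, and the API key and video IDs are placeholders you would supply yourself.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

API_URL = "https://www.googleapis.com/youtube/v3/videos"


def build_url(video_ids, api_key):
    """Build the request URL for the statistics of one or more videos."""
    query = urlencode({
        "part": "statistics",
        "id": ",".join(video_ids),  # several IDs can go in one request
        "key": api_key,             # plain API key, no OAuth needed
    })
    return f"{API_URL}?{query}"


def extract_views(payload):
    """Map video ID -> view count from a decoded API response.

    viewCount arrives as a string; a video that does not report it
    is counted as 0 here (an arbitrary choice for this sketch).
    """
    return {
        item["id"]: int(item["statistics"].get("viewCount", 0))
        for item in payload.get("items", [])
    }


def fetch_views(video_ids, api_key):
    """Call the API and return the view counts (needs network access)."""
    with urlopen(build_url(video_ids, api_key)) as response:
        return extract_views(json.load(response))
```

Calling `fetch_views(["VIDEO_ID_1", "VIDEO_ID_2"], "YOUR_API_KEY")` would then return a dictionary of view counts keyed by video ID.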
Update: Google recently removed the extension saying it has malware. I sold the extension a few months ago, and the new author seems to have put ads in place. I have re-uploaded the extension, so please use the new one. It is free from any ads whatsoever. Sorry for the trouble.
Bing Translator automatically translates non-English posts on Facebook and Twitter. As much as I’d like to appreciate the feature, it is a very unreliable service. I’m not sure about you guys, but I have never seen Bing translate even one word of my language (Nepali) correctly. I’d rather not see incorrect translations on my news feed.