
HackRice 7.5: How "uFilter" was born

I have a thing for hackathons. I am a procrastinator. A lazy, procrastinating graduate student: not a nice combination to have. But still, when I see hundreds of sharp minds in a room scrambling over ideas, hungry to build, prototype, and bring them to life, it finally pushes me into activity and makes me productive.
That is why I love hackathons, and that is why I love HackRice, the resident hackathon of Rice University.

TL;DR: if you just want to try the extension, the Chrome version is here and the Firefox version is here.

I have been participating in HackRice since 2014, which I think was the first time it was open to non-Rice students. What a roller coaster ride it has been, but that is a story for another day.
Since HackRice 7.5 was the last one I would be able to attend at Rice, it was somewhat special and emotional for me.

HackRice 7.5 starts now!
HackRice 7.5 was a tad different from the other iterations. For starters, it was the first time it was held in the spring semester, and hence it was on a smaller scale and open only to Rice students. Also, instead of the normal 26 hours, it ran for exactly 24 hours. The venue was the Liu Idea Lab. I had never been to the lab before, and it seemed to be a nice place to sit and work. The event started on Friday evening and ended on Saturday evening.
The event had two tracks: a beginner track and a Data Science track. The organizers had set up two in-depth workshops/tutorials for these tracks to help out starters, which I thought was really cool. Even though I was brainstorming and prototyping something different, I sat through them anyway and felt they were really thorough.

Being a one-person team and not really knowing anybody else, I decided to work on a relatively small project that I could finish instead of trying anything in the Data Science track. The idea I initially had was a privacy filter. After some more brainstorming, I realized that making one properly, taking all anonymizing factors into account, would probably take me more than 24 hours. I decided to settle on more of a toxic/malicious/sanity/trigger word filter.

The Idea: Create a browser extension that can filter out abusive posts, words, sentences, and paragraphs.

Inspiration: Lately a lot of us have started noticing the rise of cyberbullying and abusive behavior across the internet, be it on Reddit or in a Facebook group. Often I see something that gets me riled up just before I go to sleep, and I wish I had not read it. The recent increase in cyberbullying is one of the primary reasons for the tool. Mental health and online harassment are major, relevant issues in our society today. Everyone should be able to access content on the internet without fearing trigger words or harassment. That goes especially for people who have been victims of such incidents and really do not wish to see any such trigger words.

What is uFilter:
uFilter is a smart web extension made to help people browse the web without seeing content they don't want to see, bringing the power to choose what to see back to users. The user gets a list of buttons as filters and can choose one or several at a time. The process is simple and subtle: check off the type of content you want to avoid and let us handle the rest! Questionable content is blurred out; if you wish to see it nonetheless, you can click to reveal the text.

You can see it in action here:

What it does in the background: The content is blocked at page load, so the user is still able to see the context of the site before making up their mind whether to stay or leave. The extension has a simple UI that lets them choose what to block and what not to.

They can also click on the covered sections to reveal them as they go. The script searches through the entire DOM, looking for matching elements wherever they may be on the page. Sentiment analysis is used to determine which content is malicious.
The script also observes the page so it can adaptively block content on pages that load content dynamically, such as YouTube comments, the Facebook feed, and Twitter pages.
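To make the DOM search concrete, here is a minimal sketch of one way to collect the text nodes under a root. It is not the actual uFilter source: `NodeLike`, `collectTextNodes`, and the list of skipped tags are names and choices assumed for illustration.

```typescript
// Minimal structural view of a DOM node (hypothetical name). Real DOM nodes
// expose these same properties, so the walk is not tied to a browser.
interface NodeLike {
  nodeType: number;
  nodeName: string;
  textContent: string | null;
  childNodes: ArrayLike<NodeLike>;
}

const ELEMENT_NODE = 1; // same value as Node.ELEMENT_NODE
const TEXT_NODE = 3; // same value as Node.TEXT_NODE

// Assumption: text inside these elements is never shown to the reader.
const SKIPPED_TAGS = ["SCRIPT", "STYLE", "NOSCRIPT"];

// Depth-first walk that gathers every non-blank text node under `root`.
function collectTextNodes(root: NodeLike, out: NodeLike[] = []): NodeLike[] {
  if (root.nodeType === TEXT_NODE) {
    if ((root.textContent || "").trim().length > 0) {
      out.push(root);
    }
    return out;
  }
  if (root.nodeType === ELEMENT_NODE && SKIPPED_TAGS.indexOf(root.nodeName) !== -1) {
    return out;
  }
  for (let i = 0; i < root.childNodes.length; i++) {
    const child = root.childNodes[i];
    if (child) {
      collectTextNodes(child, out);
    }
  }
  return out;
}
```

In a browser, a real node such as `document.body` has these same properties, and a `MutationObserver` watching with `childList` and `subtree` enabled could re-run the walk on nodes that are added later.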
uFilter is not just a dumb keyword filter. It first combs every web page you visit for questionable content based on your filter selection. Once it identifies sentences containing questionable content, it uses the AFINN-165 wordlist and Emoji Sentiment Ranking to perform sentiment analysis. Once it determines that a sentence has abusive content, it blurs out only that portion.
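Wordlist-based sentiment scoring of this kind can be sketched in a few lines. The version below is a rough illustration rather than uFilter's code: `SAMPLE_SCORES` is a tiny made-up table standing in for a real list such as AFINN-165 (which assigns each word an integer valence from -5 to +5), emoji are not handled, and the function names and the threshold of -2 are assumptions.

```typescript
// Hypothetical miniature wordlist for illustration only. These are NOT the
// real AFINN-165 entries; a real list would be loaded from its data file.
const SAMPLE_SCORES: Record<string, number> = {
  hate: -3,
  stupid: -2,
  awful: -3,
  love: 3,
  great: 3,
};

// Split text into sentences, keeping the closing punctuation.
function splitSentences(text: string): string[] {
  const parts = text.match(/[^.!?]+[.!?]*/g) || [];
  return parts.map((s) => s.trim()).filter((s) => s.length > 0);
}

// Sum the valence of every known word in a sentence; unknown words count as 0.
function scoreSentence(sentence: string, scores: Record<string, number>): number {
  const words = sentence.toLowerCase().match(/[a-z']+/g) || [];
  let total = 0;
  for (const word of words) {
    const value = scores[word];
    if (typeof value === "number") {
      total += value;
    }
  }
  return total;
}

// Return only the sentences whose total score is at or below `threshold`.
function findAbusiveSentences(
  text: string,
  scores: Record<string, number>,
  threshold: number = -2
): string[] {
  return splitSentences(text).filter((s) => scoreSentence(s, scores) <= threshold);
}
```

Scoring sentence by sentence is what lets only the offending portion be hidden while the rest of the page stays readable.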

The most useful part of uFilter is that it can observe dynamic webpages and works on text that is loaded into the page dynamically. Hence it also works on Twitter or Facebook, with their rolling feeds and dynamic text.
Another distinguishing feature of uFilter is that it does not remove or replace any content. If a user decides from the context of the page that they want to read the content, just clicking on the blurry portion will reveal the text.
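Blur-then-reveal can be done without touching the underlying text at all. The sketch below assumes the blur comes from the CSS `filter` property, which the post does not confirm; `Blurrable` and `blurUntilClicked` are hypothetical names.

```typescript
// Minimal structural view of an element (hypothetical name). An HTMLElement
// has a matching `style` object and `addEventListener` method.
interface Blurrable {
  style: { filter: string; cursor: string };
  addEventListener(type: "click", listener: () => void, options?: { once: boolean }): void;
}

// Blur the element in place and restore it the first time it is clicked.
// The text itself is never removed or replaced, only visually obscured.
function blurUntilClicked(el: Blurrable, radiusPx: number = 6): void {
  el.style.filter = `blur(${radiusPx}px)`;
  el.style.cursor = "pointer";
  el.addEventListener(
    "click",
    () => {
      el.style.filter = "none";
      el.style.cursor = "auto";
    },
    { once: true } // the reveal handler is dropped after the first click
  );
}
```

Because only a style is toggled, revealing the text is instant and needs no copy of the original content.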

All this is done in real time, so the user does not notice any difference in their normal browsing. But of course, properly identifying abusive content purely programmatically is a hard problem. Recognizing that, uFilter gives the user an option to tag/mark/categorize text as offensive. Once a user does that, the filter learns from it. This information is stored in a Firebase datastore without any identifying information and helps uFilter improve.
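The post does not say how the filter learns from these tags, so the following is only one possible approach, sketched under assumptions: anonymous reports carry just a phrase and a category, and a phrase is promoted into the filter once enough independent reports agree. `FlagReport`, `promoteFlaggedPhrases`, and `minReports` are hypothetical names, and no Firebase calls are shown.

```typescript
// An anonymous report (hypothetical shape): no user ID, page URL, or timestamp.
interface FlagReport {
  phrase: string;
  category: string;
}

interface PromotedPhrase {
  category: string;
  phrase: string;
  count: number;
}

// Lowercase and collapse whitespace so near-identical reports are counted together.
function normalizePhrase(phrase: string): string {
  return phrase.toLowerCase().replace(/\s+/g, " ").trim();
}

// Count reports per (category, phrase) pair and keep the pairs that were
// reported at least `minReports` times.
function promoteFlaggedPhrases(reports: FlagReport[], minReports: number): PromotedPhrase[] {
  const SEP = "\u0000"; // separator that cannot appear in normal text
  const counts: Record<string, number> = {};
  for (const report of reports) {
    const key = report.category + SEP + normalizePhrase(report.phrase);
    counts[key] = (counts[key] || 0) + 1;
  }
  const promoted: PromotedPhrase[] = [];
  for (const key of Object.keys(counts)) {
    const count = counts[key] || 0;
    if (count < minReports) {
      continue;
    }
    const parts = key.split(SEP);
    promoted.push({ category: parts[0] || "", phrase: parts[1] || "", count });
  }
  return promoted;
}
```

Requiring several matching reports before a phrase is promoted is a simple guard against a single user skewing the shared filter.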

The end result is a uFilter that can intelligently sanitize any website or webpage you visit of any abusive content you do not wish to see.

Got these beautiful earphones as a prize
Coming back to HackRice: I really did not expect much when I submitted uFilter for judging. I had just barely made a working prototype and published it in the Chrome Web Store. It was working in Firefox, but it still had some security problems, because of which Firefox was not publishing the add-on. Surprisingly, the judges, including Dr. Wang, were really interested in the idea and especially the implementation. When the time came to decide the winners, it was announced that uFilter had won first prize! Imagine my delight!

If you want to know more about the project, visit the submission page on Devpost.
If you try the add-on, I will be delighted to hear your feedback!

Chrome version download link!
Firefox version download link!

The loot :D A QC30 and Solar Backpack
