thedatadev.com

Hello, I'm
Edrian Gomez.

A software creative & ML engineer based in Sydney who likes to eat Tonkotsu ramen 🍜 shoot fadeaway jump shots 🏀 bust a move 🕺🏽 watch Japanese game shows 📺 travel abroad ✈️ get sh*t done 👔 spend 3 hours at the character customisation screen 🎮 solve important problems 🔎 listen to city pop 🎧 explore Daiso 🏪 get creative 🎨

Frontend cards

Learn how to develop websites by first dissecting UIs into smaller components. For example, a typical eCommerce website could contain a header, a catalog, and a footer. In turn, the header could contain a home button, a search bar, and a set of navigation links. This project aims to help beginners learn web development using a bottom-up approach, i.e. by implementing the smallest components in a UI first.

View a live demo

Typical user story. The user browses a catalog of UIs and picks one that interests them. They are then taken to a screen that shows a breakdown of the subcomponents the chosen UI is composed of. If we visualise this breakdown as a tree of components, the user starts at the leaf nodes and progressively develops the techniques necessary to implement the chosen UI as a whole, i.e. once they reach the root node.
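The leaf-first learning order described above corresponds to a post-order traversal of the component tree. A minimal sketch, assuming the tree is stored as a child-list mapping (the names here are illustrative, not the project's actual data model):

```python
# Hypothetical sketch: leaves are visited first, the root last,
# mirroring the learner's path through a UI's component tree.

def learning_order(component, children_of):
    """Yield components leaves-first, ending with the given root."""
    for child in children_of.get(component, []):
        yield from learning_order(child, children_of)
    yield component

# Example UI: a page with a header (home button, search bar, nav links),
# a catalog, and a footer.
ui_tree = {
    "page": ["header", "catalog", "footer"],
    "header": ["home-button", "search-bar", "nav-links"],
}

print(list(learning_order("page", ui_tree)))
```

Running this yields the header's subcomponents first and `"page"` (the root node) last.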

Technical challenges. Each UI on the website needs screenshots for all of its components, recursively: its components, its components' subcomponents, and so on. To automate this process I set a "data-" attribute on every component I want a screenshot of, and run a custom TestCafe script that takes PNG screenshots of every component across various screen sizes. Once the user starts coding a particular component, they are shown a simple text editor (based on the Monaco editor) alongside a preview panel, which renders a real-time preview of their code. Once they are happy with their implementation, they can evaluate their work by running a visual regression check (based on pixelmatch) that compares actual and expected screenshots.
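The core of a visual regression check can be sketched in a few lines. This is a simplified stand-in for pixelmatch (which the project actually uses): images are represented here as flat lists of (R, G, B) tuples, and the channel threshold and mismatch tolerance are illustrative assumptions, not the real tool's defaults.

```python
# Simplified visual-regression sketch: count pixels whose colour differs
# noticeably between the expected and actual screenshots.

def diff_ratio(expected, actual, channel_threshold=16):
    """Fraction of pixels with any channel differing by more than the threshold."""
    assert len(expected) == len(actual), "screenshots must be the same size"
    mismatched = sum(
        1 for e, a in zip(expected, actual)
        if any(abs(ec - ac) > channel_threshold for ec, ac in zip(e, a))
    )
    return mismatched / len(expected)

def passes_regression(expected, actual, tolerance=0.01):
    """Pass if at most `tolerance` of the pixels differ."""
    return diff_ratio(expected, actual) <= tolerance

white = [(255, 255, 255)] * 100
almost_white = [(250, 252, 255)] * 99 + [(0, 0, 0)]  # one clearly wrong pixel
print(passes_regression(white, almost_white))  # 1% mismatch, within tolerance
```

The per-channel threshold absorbs anti-aliasing noise, so only meaningful visual differences count against the learner's implementation.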

Benefits of this approach. There is a lot of value in presenting learning material in bite-sized chunks. It makes learning less intimidating, it's easier to stay motivated and there is less friction to actually start coding if the goal is small and won't take long. Furthermore, the web is saturated with videos and articles for learning web development which lends itself to a lot of passive learning. Coding is a practical discipline, and this project has a large emphasis on being hands-on.

Language learning recommender

Language learning is a costly activity, both in terms of time and money. This project aims to provide language learners with an efficient way to find online source material written in their target language. The goal is to use machine learning models to classify online documents and videos into predefined comprehension levels.

Emulating immersion. The most effective way to learn a language is to immerse yourself in its culture, which people commonly do by studying abroad. However, not everyone has the means to do this. One alternative is to expose oneself to online articles and videos, but these resources demand different levels of comprehension and language proficiency. This project involves creating a recommender system that presents learners with articles and videos suited to their language level.

Using machine learning. Natural language processing is used in a couple of ways. It is first used in an unsupervised setting for data exploration, to uncover "latent language levels" within a dataset, i.e. clusters of articles that should fall under the same language level. It is also used in a supervised setting, whereby an established language proficiency grading system, e.g. the JLPT, is used to classify articles into levels N1 to N5.
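As a toy illustration of the supervised setting, here is a bag-of-words nearest-centroid classifier that assigns a text to a JLPT-style level. The training snippets and labels below are made up for the example; the actual models are trained on full articles.

```python
# Toy sketch: one word-count "centroid" per level, cosine similarity to classify.
from collections import Counter
import math

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def train(labelled_texts):
    """Aggregate word counts into one centroid per level."""
    centroids = {}
    for level, text in labelled_texts:
        centroids.setdefault(level, Counter()).update(text.split())
    return centroids

def classify(text, centroids):
    """Return the level whose centroid is most similar to the text."""
    counts = Counter(text.split())
    return max(centroids, key=lambda lvl: cosine(counts, centroids[lvl]))

corpus = [
    ("N5", "watashi wa gakusei desu"),
    ("N5", "kore wa hon desu"),
    ("N1", "keizai seisaku no eikyou ni tsuite"),
]
centroids = train(corpus)
print(classify("kanojo wa sensei desu", centroids))  # shares words with N5 texts
```

A real pipeline would use proper tokenisation and richer features, but the shape of the supervised problem (text in, level out) is the same.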

Datasets. I've decided to start off with datasets containing Japanese texts, just because I myself am self-studying Japanese. Using Pushshift, I am pulling data from a subreddit that posts links to NHK Web Easy articles. These articles are simplified Japanese news articles targeted at Japanese youth and Japanese language learners. Each regular news article on NHK has a corresponding simplified version on NHK Web Easy. I pull both versions, and index them using a "Trie" data structure for retrieval purposes. I am currently trying to find a suitable video-based dataset which I plan to convert to text and index in the same way I do with the NHK articles.
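A trie maps each term, character by character, to the articles containing it, so prefix lookups are fast. A minimal sketch, assuming each article's terms are inserted with the article's ID (the field names and IDs are illustrative):

```python
# Minimal trie for article retrieval: every node on a term's path records
# which articles contain that prefix.

class Trie:
    def __init__(self):
        self.children = {}
        self.article_ids = set()

    def insert(self, term, article_id):
        node = self
        for ch in term:
            node = node.children.setdefault(ch, Trie())
            node.article_ids.add(article_id)

    def lookup(self, prefix):
        """Return the IDs of all articles containing a term with this prefix."""
        node = self
        for ch in prefix:
            if ch not in node.children:
                return set()
            node = node.children[ch]
        return node.article_ids

index = Trie()
index.insert("新聞", "nhk-001")    # "newspaper", from a regular article
index.insert("新しい", "easy-001")  # "new", from its Web Easy counterpart
print(index.lookup("新"))  # both articles share this prefix
```

This is convenient here because a regular article and its Web Easy counterpart share vocabulary stems, so a single prefix query can surface both versions.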

isleVR

A mobile, browser-based editor that lets you drag and drop 3D models to create your own VR spaces. Each space is a small, self-contained context, hence the name 'isle', which is also an acronym for 'immersive secondary language education'. My aim is to build a better flash-card system for learning languages through immersion. Features in the works include voice capture for spoken practice and cross-platform, multiplayer interaction.

Context-based learning. Vocabulary and expressions are better learned in context. Trying to memorize a thousand words using flash cards doesn't allow one to understand the relations between words and the concepts behind them.

Virtual reality. Through virtual reality, we can create a controlled learning environment that can be modelled to simulate various situations and environments that occur in real life. In this VR space, a learner can interact with objects or NPCs in the language they are learning.

Implementation. The current MVP is implemented with three.js and only works on mobile devices and browsers that support WebXR. The 3D models shown in the demo are basic assets I created using Blender. Some current smartphones can perform photogrammetry, which is something I plan to explore (since creating my own assets isn't scalable). Ideally, I would like to rebuild this project in Unity, since it can also export C# projects as web apps via WebAssembly.

B&B Chatbot

Replaces traditional forms with a conversational interface to facilitate search queries, demonstrating the strengths of current state-of-the-art chatbot technology. Traditional web search forms are structured differently on every site; a consistent conversational interface across sites improves the user experience, since free-form queries make it faster for users to find what they need.

Rationale. Traditional search forms vary from webpage to webpage and come in different structures. What if we could standardize the way search forms are implemented to make them webpage-agnostic whilst improving user experience? People are already used to using messenger apps, so by taking a familiar UI design we reduce friction and cognitive load involved in filling out forms.

Implementation. Search queries are sent to Dialogflow which uses domain knowledge to parse a query and return the query's parameters (e.g. price, location, length of stay) in a structured format using named-entity recognition. These parameters are then used to query a relational database and the query results are filtered, sorted and scored by a custom recommender system built on Python and Pandas. These results are displayed on a simple web client built on Vue.js and Vuex.
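The post-Dialogflow step can be sketched as: structured query parameters filter the candidate listings, and a simple score ranks the survivors. Everything below is an illustrative assumption (the real system queries a relational database and scores with a pandas-based recommender; the field names and weights here are made up):

```python
# Rough sketch: parsed parameters -> filter -> score -> ranked results.

def recommend(listings, params):
    """Filter by location and budget, then rank by a toy rating/price score."""
    matches = [
        b for b in listings
        if b["location"] == params["location"] and b["price"] <= params["max_price"]
    ]

    def score(b):
        # better-rated and cheaper listings score higher (illustrative weights)
        return b["rating"] - b["price"] / params["max_price"]

    return sorted(matches, key=score, reverse=True)

listings = [
    {"name": "Harbour B&B", "location": "Sydney", "price": 180, "rating": 4.6},
    {"name": "Beach Stay", "location": "Sydney", "price": 120, "rating": 4.2},
    {"name": "City Loft", "location": "Melbourne", "price": 90, "rating": 4.8},
]
# e.g. parsed from "somewhere in Sydney under $200 a night"
params = {"location": "Sydney", "max_price": 200}
print([b["name"] for b in recommend(listings, params)])
```

The point of the conversational front end is that `params` arrives in this structured form no matter how the user phrased the original query.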