hcoelho.com

my blog


Test cases, UI for affinity tool, and putting things together - A quick update


This is going to be just a quick update about the progress of our current project, since I don't really have any big news about it.

For the past weeks, I have been working on revamping our test suites: making sure they still pass, making new tests for functions that were not previously tested, and building test cases for the frontend (as described in my previous post: Making a test suite for the front-end client). No big news here - it is all working nicely!

I also made another query (an interface for it, really) for getting information about user affinities (which I explained in this post: User Affinity Tool: grouping and finding patterns for users). Now people who are using Rutilus have a nice interface for querying users based on their affinities - for example, getting the emails of all users who seem to be engineers.

And finally, I was also working on modularizing our project: the modules now seem to be working properly (based on the results we got from our test suites), so I made a new version of Rutilus for our industry partner using the modules we built (instead of the monolithic application we had before). Soon enough, we will deploy this new version for them.

cdot rutilus 

Making a test suite for the front-end client


Test cases are a very important tool to have: they help us know whether or not the program is working properly, and they also ensure that future updates will not break what was already done. Making these tests on the backend of a program is relatively simple and common, but when we need to test the user interface, it gets a little more complicated. For our project, we have JavaScript code that runs in the browser (the Observer), and we need to make sure all of its functions that depend on user input are working properly. In this post I will explain how I used Webdriver.io to simulate user interactions for our client module.

Selenium is an application that automates your browser - it provides the tools we need to create the user interactions - while Webdriver.io is a Selenium binding for Node.js. To use Webdriver.io, we need Selenium installed. It was simple enough, just using these commands:

$ curl -O http://selenium-release.storage.googleapis.com/3.0/selenium-server-standalone-3.0.1.jar
$ curl -L https://github.com/mozilla/geckodriver/releases/download/v0.11.1/geckodriver-v0.11.1-linux64.tar.gz | tar xz
$ java -jar -Dwebdriver.gecko.driver=./geckodriver selenium-server-standalone-3.0.1.jar

And done, Selenium is installed and running.

Now, to install Webdriver.io in our node project, we can simply use NPM:

npm install --save-dev webdriverio

And we are ready to go.

In a JavaScript file (the one we use to run the routines in the browser), we configure and start Webdriver.io this way:

const webdriverio = require('webdriverio');

const options = {
    desiredCapabilities: { browserName: 'firefox' }
};

It is a very simple configuration in our case, since we want Firefox to run the test cases, and no other options were necessary. We could also use PhantomJS (a headless, "invisible" browser), but I think it is useful to have Firefox actually appearing, so we can inspect exactly what is happening.
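If we wanted the headless PhantomJS option instead, the only thing that changes is the browser name (this assumes PhantomJS is installed and registered with the Selenium server):

```javascript
// Same configuration shape as before, only the browser name changes
// (assumption: PhantomJS is available to the running Selenium server)
const headlessOptions = {
    desiredCapabilities: { browserName: 'phantomjs' }
};

console.log(headlessOptions.desiredCapabilities.browserName); // prints "phantomjs"
```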

Now to start Webdriver.io and make it go to our test page:

webdriverio
    .remote(options)
    .init()
    .url('file:///<path>/file.html')

When we run this JavaScript file, Firefox will pop up and open the specified file.

Now some commands we used to simulate user interaction:

webdriverio
    .remote(options)
    .init()
    .url('file:///<path>/file.html')

    // Telling Webdriver.io to click on a link (<a> tag) with the id "my-link":
    .click('a#my-link')

    // Scrolling 300 pixels to the bottom and to the right
    .scroll(300, 300)

    // Filling an input (<input> tag) with the class "my-input" (the first input with this class, in this case) with the value "text"
    .setValue('input.my-input', 'text')

    // Selecting the third option of a select box (<select> tag) with the id "my-select"
    .selectByIndex('select#my-select', 2)

    // Pausing for 2 seconds
    .pause(2000)

    // Executing a JavaScript function in the browser and logging "Hello world" in the browser's console
    .execute(function () {
        console.log('Hello world');
    })

    // Closing the browser and ending the tests
    .end();

That's it! Using Webdriver.io was surprisingly simple, and I was able to complete all the tests in a few hours.

Some commands are not available in Webdriver.io - for example, selecting text on the page. But I could work around this by using the .execute command and passing a function that selects the text I wanted directly in the browser:

.execute(function () {
    // Element with the text to be selected
    const element = document.getElementById('div-with-text-to-select');

    // Creating a range around the element and setting it as the selection
    const range = document.createRange();
    range.setStartBefore(element);
    range.setEndAfter(element);
    window.getSelection().addRange(range);
})

I was also able to fire custom events in the browser using the .execute command:

.execute(function () {
    // I want to fire the event from this element
    const element = document.getElementById('source-of-event');

    // Creating my custom event
    const event = new CustomEvent('eventName');

    // Firing
    element.dispatchEvent(event);
})

cdot javascript frontend test selenium webdriver 

Open-sourcing our project


With the goals of our project almost reached, we decided to go one step forward: we will open-source our whole application (except for the proprietary details). In this post I will describe how we will do this and what we have done so far.

First, what is our project about? In short: our project is a more flexible and customizable version of Google Analytics. It records data from the users and gives you tools to analyze it. Here is what it is capable of recording by default (among other things):

  • When the user opens and closes the page
  • How far the user scrolled in the page
  • What links the user clicked
  • What texts the user selected
  • What texts and images the user copied
  • Which ads the user clicked
  • What forms the user started to fill and what forms they completed
  • What parts of the website the user shared and/or followed on social media
  • If the user commented or not
  • Miscellaneous information about the page the user visited (title, content, number of words, number of images, number of videos, number of paragraphs, etc)
  • User location (latitude, longitude, city, country...)
  • If the user had cookies enabled or not
  • The user's operating system, device, browser, browser engine, and cpu
  • Miscellaneous information about the user (in case they are logged in: name, email, etc)

This is the data that we record. Now, here is a problem: what do we do with all this information? Our tool also gives the website owner some ways to analyze it. Based on all that information, we can answer questions like:

  • How far are users scrolling on the articles of my website, for the author X?
  • How long do they stay on the articles before they close the page?
  • Is there a relation between the number of words/images and the time spent on the page?
  • What are the most successful articles, based on the number of shares and time spent on page?
  • How popular are the sections and authors in my website?
  • What is the optimal length for the articles?
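Questions like the words-vs-time one boil down to a correlation over the recorded fields. A toy sketch with made-up numbers (not a real Rutilus query), using Pearson's coefficient:

```javascript
// Pearson correlation between two equally-sized samples
function pearson(xs, ys) {
    const n = xs.length;
    const mean = (a) => a.reduce((s, v) => s + v, 0) / n;
    const mx = mean(xs);
    const my = mean(ys);
    let num = 0, dx = 0, dy = 0;
    for (let i = 0; i < n; i++) {
        num += (xs[i] - mx) * (ys[i] - my);
        dx += (xs[i] - mx) ** 2;
        dy += (ys[i] - my) ** 2;
    }
    return num / Math.sqrt(dx * dy);
}

// Made-up records: number of words per article vs. seconds spent on the page
const words   = [300, 800, 1200, 2000, 2600];
const seconds = [40, 90, 150, 210, 260];

console.log(pearson(words, seconds).toFixed(2)); // close to 1: strong positive relation
```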

We also provide a graphical representation of a timeline, in order to see all the actions a user did on the website, for example:

  • 12:00 Visited the home page
  • 12:03 Clicked the "Articles" link
  • 12:12 Scrolled to the middle of the page
  • 12:24 Scrolled to the bottom of the page
  • 12:29 Shared on facebook
  • 12:31 Closed the page
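Rendering that timeline from the raw log is mostly a sort-and-format pass. A minimal sketch (the record shape here is an assumption for illustration, not the actual Rutilus schema):

```javascript
// Hypothetical event records, roughly as the Logger might store them
const events = [
    { time: '12:03', action: 'Clicked the "Articles" link' },
    { time: '12:00', action: 'Visited the home page' },
    { time: '12:31', action: 'Closed the page' },
];

// Sort chronologically and format one line per action
const timeline = events
    .slice()
    .sort((a, b) => a.time.localeCompare(b.time))
    .map((e) => `${e.time} ${e.action}`);

console.log(timeline.join('\n'));
```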

We provide these pre-made questions for the owner of the website, but you can also make your own using a graphical tool we made (we wanted to make it as user friendly as possible). I explained how we made it in my article Simulating Inner Joins on MongoDB. The results can be easily exported as a .csv file and used on Microsoft Excel to make graphs and pivot tables!

Ok, but here is another thing we can do with this tool: we can profile users based on the information gathered. For example, say we have an optional field for registered users in which they can provide their job: we can observe a user's browsing pattern and try to find users similar to them - if a user says he is a "designer" and reads a lot of articles about design, then maybe people who mostly read articles about design are also designers.

With this possibility, we can now target content and ads for people who "are or look like engineers". I made a blog post (User Affinity Tool: grouping and finding patterns for users) on how we accomplished this too.
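A heavily simplified sketch of that idea (toy data and a plain cosine-similarity score; the real affinity tool is described in the linked post):

```javascript
// Toy reading histories: counts of articles read per section
const users = {
    alice:   { job: 'designer', design: 12, sports: 1 },
    bob:     { job: null,       design: 10, sports: 2 },
    charlie: { job: null,       design: 0,  sports: 9 },
};

// Cosine similarity over the section counts (the job field is excluded)
function similarity(a, b) {
    const keys = ['design', 'sports'];
    let dot = 0, na = 0, nb = 0;
    for (const k of keys) {
        dot += (a[k] || 0) * (b[k] || 0);
        na  += (a[k] || 0) ** 2;
        nb  += (b[k] || 0) ** 2;
    }
    return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

// Bob reads like Alice, our known designer, so he is probably one too
console.log(similarity(users.alice, users.bob) > similarity(users.alice, users.charlie)); // true
```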

So what is going to be the name of our project? Well, what it does kind of looks like a census, right? We gather data about people in an area in order to profile them. This is why we decided to name our project Rutilus, after Gaius Marcius Rutilus - he was the first plebeian censor of ancient Rome.

Right, but how do we open-source something like this? This is what we are doing in order to make this happen:

Separating into modules

Our first problem is that our application was modular, but still too monolithic for what we intended: we want to offer people the full, pre-configured package that is ready to go and easy to configure, but we also want people to be able to make their own parts if they wish to do so. So we want our application to be like Lego bricks: you can pick the parts you want, make your own (if you really want to), or just get the whole pack.

So we decided to separate it into 4 modules that are completely independent:

  • Observer: this is the module that goes into the client's browser. It gathers information from the user and sends it via HTTP/WebSockets to the Logger module.

  • Logger: this is an API; it connects to a MongoDB database and stores the incoming data in it.

  • Analytics: this is the interface that allows you to analyze the data that we gather: it gives you the affinity tool, used to profile users based on their browsing patterns, and the dashboard, used to query the database.

  • Heartbeat: this is a small helper module: it sends constant requests to both the Logger and Analytics modules and records their response times. If we are packing these modules in a container, it also gives you the option to throw an error and crash the container (making it restart) if one of the modules stops responding.
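The crash-and-restart decision the Heartbeat makes can be sketched as a pure function (the thresholds and the shape of the data are assumptions for illustration, not the real module's API):

```javascript
// Given the last few ping results (response time in ms, or null for no answer),
// decide whether to crash the container so the orchestrator restarts it
function shouldRestart(pings, { maxMs = 2000, maxFailures = 3 } = {}) {
    const failures = pings.filter((ms) => ms === null || ms > maxMs).length;
    return failures >= maxFailures;
}

console.log(shouldRestart([120, 80, 95]));           // false: everything healthy
console.log(shouldRestart([null, null, null, 150])); // true: module stopped responding
```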

Making each module easy to launch and configure

Here is a question: can we make the modules so easy to start that it only takes one line of code? Yes, we can. And this is what we are doing. The only thing you will need in order to launch a module is to import and run it, passing the settings to it. For example:

require('rutilus-logger-node')({ port: 8080, ... });

This would launch the Logger module on port 8080. There are more settings required, but they will all be easy to configure.
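The pattern behind that one-liner is just a module exporting a single function that takes a settings object and starts the service. A minimal sketch of the idea (the field names here are illustrative assumptions, not the real Rutilus configuration):

```javascript
// Sketch of the one-line-launch pattern: one function, one settings object
// (field names are assumptions made up for this example)
function startLogger(settings) {
    const { port, databaseUrl } = settings;
    // ...here the real module would connect to the database and start listening...
    return `Logger listening on port ${port}, storing into ${databaseUrl}`;
}

console.log(startLogger({
    port: 8080,
    databaseUrl: 'mongodb://localhost:27017/rutilus',
}));
```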

I think one of the biggest problems I had when I wanted to try a new tool was the configuration: they asked me for something I had no idea where to find, and I had to read two pages of cryptic documentation that would not help me at all. We will make sure this doesn't happen.

We will also give people options to record custom fields: do you want to record the user's age? All it takes is another line of configuration and you are all set.
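One plausible shape for such an option is a field name plus an extractor function. A hedged sketch of that configuration style (invented names, not the actual Rutilus option):

```javascript
// Hypothetical custom-field configuration: a name plus how to read it from the user
const customFields = [
    { name: 'age', extract: (user) => user.profile.age },
];

// What the Observer might then record for a given logged-in user
const user = { profile: { age: 34 } };
const recorded = Object.fromEntries(
    customFields.map((f) => [f.name, f.extract(user)])
);

console.log(recorded); // { age: 34 }
```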

Publishing the modules on NPM

The convenience of having a package manager is, without a doubt, something we must take advantage of. Instead of making people download our source code in order to run their project, they can just run an npm command to install it:

npm install --save rutilus-logger-node

And they can start using them.

This required a bit of research from us: we wanted to make sure our package.json files were properly configured in order to publish our modules on NPM.

Packing everything in another ready-to-go module

Having the option to pick and choose our own modules is nice, but what about people who don't know Node.js, just want to get it done, and need a quick solution? We will provide one for that too.

We don't know the details yet because we haven't started working on it, but this is what I have in mind:

1. Download a small .zip file with an NPM project
2. Change one or two configuration files according to our documentation (again, it must be very easy to do)
3. Run a few commands to deploy it on Amazon Web Services or Heroku
4. Include the Observer module in your website

And done. This is what we are aiming for, and I am sure we can accomplish it.

cdot rutilus open-source