Build a bot using Selenium WebDriver with Ruby
In this week’s blog, we will explore the exciting world of automating tasks with Selenium WebDriver. We will start with a brief introduction to some of WebDriver’s common uses, install the needed tools on our local machine, and practice writing a few simple WebDriver scripts in Ruby. I will be using my fellow developer Matthew Aquino’s StoX website to demonstrate the code.
According to the official Selenium website, the primary use of WebDriver is to “create browser-based automation suites and tests”. If you’re an active console gamer who has tried to purchase a PlayStation 5 or an Xbox Series X in recent months, only to be turned away each time you pressed that Place Order button, you have no doubt heard of the infamous bots. That’s one example of what a WebDriver script does: it emulates a series of user actions to automate a task.

Before we can write our scripts, let’s install Ruby. If you’re on a Windows machine, RubyInstaller is a one-click download and install process. For Mac users, Ruby is likely already installed; you can confirm by running ruby -v in your terminal. Otherwise, run the command \curl -sSL https://get.rvm.io | bash -s stable --ruby to install RVM with Ruby.
We will also need a code editor to write our scripts in, and a browser. I will be using VS Code, but any other editor such as Atom or Sublime is perfectly fine. The browser I will be using is Google Chrome, but Firefox, Edge, Safari, etc. all work; I will point out the small syntax differences as we go along.
Finally, we will need two Ruby gems. The first one is selenium-webdriver; you can install it by running the command below from a terminal: gem install selenium-webdriver
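Next is chromedriver-helper, which supplies the ChromeDriver executable that Selenium uses to control Chrome. Run this command: gem install chromedriver-helper
(Note that chromedriver-helper is no longer maintained. With selenium-webdriver 4.6 or later, the bundled Selenium Manager can locate or download a matching driver by itself, so this gem may not be needed on a newer setup.)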
Let’s start by creating a project folder named webdriver-project, and inside it, a bot.rb file.

We first require the gems at the top of the file, then create an instance of WebDriver and assign it to a variable named driver. If your browser is not Chrome, you can simply replace the symbol :chrome with :firefox, :ie, or the symbol for your browser. Finally, we tell the driver to navigate to a specific URL, in this case the StoX website.
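As a rough sketch, the setup just described might look like the following. The URL is a placeholder rather than the real StoX address, and the names STOX_URL, start_browser, and open_site are made up for this example. It loads only selenium-webdriver, which should be enough when a ChromeDriver is already available on your machine. The gem is required inside start_browser only so the file can be loaded without launching a browser; in a short script you would normally put the require on the first line.

```ruby
# bot.rb: a minimal sketch of the setup, not the original listing.

# Hypothetical placeholder; substitute the real address of the StoX site.
STOX_URL = "https://example.com/stox"

# Launches a browser and returns the driver object that controls it.
def start_browser(browser = :chrome)
  require "selenium-webdriver"
  Selenium::WebDriver.for(browser) # e.g. :chrome, :firefox, :edge, :safari
end

# Navigates the driver to a page and returns the driver for further calls.
def open_site(driver, url = STOX_URL)
  driver.get(url)
  driver
end

# Typical use at the bottom of bot.rb:
#   driver = start_browser(:chrome)
#   open_site(driver)
```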
After this point, the process is fairly straightforward. We first find an HTML element within the page (such as a text field, a button, an image, a radio button, or a checkbox), then perform some action on that element (like entering some text, clicking on it, or selecting one of its options). We then do the same for the next element until we are done.
Similar to JavaScript’s document.getElementById family of methods, Selenium has a driver.find_element method, and we can pass in a key-value pair to tell the driver how to find an element.
If we go to the StoX website and inspect the search bar at the top with the developer tools, we can see its markup is: <input placeholder="Input stock symbol" type="text" value="">
Since the input does not have any ID or name, we will locate it by its tag name like so:
input_field = driver.find_element(tag_name: "input")
Note that if it had a unique id or name, we could use something like:
input_field = driver.find_element(id: "its_unique_id") or
input_field = driver.find_element(name: "its_unique_name")
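One thing to keep in mind is that find_element returns only the first element that matches, so locating by tag name works here only as long as the search bar is the first input on the page. A CSS selector on the placeholder attribute is a slightly more specific alternative. The names SEARCH_FIELD_CSS and find_search_field below are made up for illustration, and driver is assumed to be a WebDriver instance that already has the StoX page open.

```ruby
# CSS selector built from the placeholder attribute of the search bar.
SEARCH_FIELD_CSS = "input[placeholder='Input stock symbol']"

# Returns the search box element from the page the driver currently has open.
def find_search_field(driver)
  driver.find_element(css: SEARCH_FIELD_CSS)
end

# Usage, assuming `driver` is already on the StoX page:
#   input_field = find_search_field(driver)
```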
Now let’s send some text and a keystroke to the field.

Notice that if you manually enter TSLA in the search bar without pressing Enter, the website does not load any new information, so we need to send an Enter keystroke as well.
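This step could be sketched as below: type the symbol, then press Enter. send_keys accepts strings for text and symbols such as :enter for special keys. The method name search_for is hypothetical, and driver is assumed to be a WebDriver instance already on the StoX page.

```ruby
# Types a stock symbol into the search bar and submits it with the Enter key.
def search_for(driver, symbol)
  input_field = driver.find_element(tag_name: "input")
  input_field.send_keys(symbol)  # type the text, e.g. "TSLA"
  input_field.send_keys(:enter)  # the page only updates after Enter is pressed
  input_field
end

# Usage:
#   search_for(driver, "TSLA")
```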
At this point our simple script is fully functional. Let’s add a line to quit the driver and run the script from our terminal.
driver.quit

As you can see, when you run the script, Chrome launches automatically, navigates to our website, enters the text “TSLA” followed by the Return key, and then closes immediately.
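Since the browser only closes when driver.quit is reached, an error earlier in the script (for example, an element that can’t be found) would leave a Chrome window open. One way to guard against that, sketched here with a hypothetical helper named with_driver, is to put the quit call in an ensure clause:

```ruby
# Yields the driver to the block and always quits it afterwards,
# even if the block raises an error.
def with_driver(driver)
  yield driver
ensure
  driver.quit
end

# Usage, assuming selenium-webdriver is installed:
#   require "selenium-webdriver"
#   with_driver(Selenium::WebDriver.for(:chrome)) do |driver|
#     driver.get("https://example.com") # placeholder URL
#   end
```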
For our next script, let’s create an account and log in.

When we inspect the Login button on the homepage, we can see it’s an a tag with a class of item. Unfortunately, that isn’t enough to locate the element easily; however, its inner text (Login) is unique, so we can use that instead.
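Selenium has a link_text locator for this kind of situation: it matches an a tag by its visible text. Here is a sketch, with click_login as a made-up method name and driver assumed to be a WebDriver instance on the StoX homepage:

```ruby
# Finds the anchor whose visible text is exactly "Login" and clicks it.
def click_login(driver)
  login_link = driver.find_element(link_text: "Login")
  login_link.click
  login_link
end

# Usage:
#   click_login(driver)
```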

After we click on the Login button, we see a popup window with an option to sign up. Upon inspecting it, we find it’s a button with the classes ui and button. Since its properties still aren’t unique enough, this time we will grab it by its XPath:
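The exact XPath depends on the page. One possibility, assuming the popup’s button is labelled Sign up, is to match the button by its text. The constant and method names here are hypothetical, and driver is assumed to be a WebDriver instance with the popup open:

```ruby
# Matches a <button> whose text, ignoring leading and trailing whitespace,
# is "Sign up". The label is an assumption about the popup.
SIGN_UP_XPATH = "//button[normalize-space()='Sign up']"

# Finds the first matching button and clicks it.
def click_sign_up(driver)
  sign_up_button = driver.find_element(xpath: SIGN_UP_XPATH)
  sign_up_button.click
  sign_up_button
end

# Usage:
#   click_sign_up(driver)
```

If the popup takes a moment to appear, find_element may raise a NoSuchElementError; wrapping the lookup in Selenium::WebDriver::Wait.new(timeout: 10).until { ... } is one way to retry until the button shows up.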


We’re greeted with another screen after clicking on the Sign up button. This time we can enter a username and a password and press the final Sign up button. Each of these is its own HTML element:
<input placeholder="Desired Name" name="name" type="text" value="">
<input placeholder="Desired Password" name="password" type="password" value="">
<button type="submit" class="ui button">Sign up</button>
We can easily locate the input fields by their names, and we will use the same strategy as earlier to select the button by its XPath.
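Put together, this last step might look like the sketch below. The name attributes and the submit type come from the form’s markup, while the method name create_account and the choice of XPath are assumptions. The XPath targets the button’s type rather than its text, on the guess that the form has only one submit button.

```ruby
# Fills in the sign-up form and submits it.
def create_account(driver, username, password)
  driver.find_element(name: "name").send_keys(username)
  driver.find_element(name: "password").send_keys(password)
  driver.find_element(xpath: "//button[@type='submit']").click
end

# Usage, with made-up credentials:
#   create_account(driver, "bot_user", "s3cret")
```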

Now let’s test our program!

The final code for today’s blog can be downloaded from my GitHub page. The documentation for Selenium is at this link. And finally, Ms. Meaghan Lewis from GitHub has a mini video series introducing Selenium WebDriver with Ruby that’s very easy to follow; I would highly recommend watching it!
See you next week!