16 Jul 2013

Acquisition Coverage

Posted by Matt Mazur (@mhmazur)

The news of Automattic's acquisition of Lean Domain Search spread faster and wider than I ever would have imagined.

Here are a few of the news outlets that covered it:

Thank you all for your support!


15 Jul 2013

Lean Domain Search Acquired by Automattic

Posted by Matt Mazur (@mhmazur)

I’m thrilled to announce that Lean Domain Search has been acquired by Automattic, the company behind WordPress.com!

At Automattic I’ll be working full time on making it even easier for WordPress.com users to find and register great domain names for their websites and blogs.

What does this mean for Lean Domain Search? Not only will it continue to run, but it’s also now completely free to use. There is even now an option in Lean Domain Search to register domain names directly through WordPress.com, making it easier than ever to start a website with a great domain name.

I’d like to extend a special thanks to my wife, my family and friends, the Orlando tech community, and Lean Domain Search’s users for their tireless support and feedback. Lean Domain Search wouldn’t be what it is without all of you.

If you're new to Lean Domain Search and want to check it out for yourself, you're welcome to head on over to the homepage to perform your own search. Cheers!

Matt Mazur
Founder, Lean Domain Search
matt@leandomainsearch.com | @mhmazur


16 May 2013

An Inside Look at Lean Domain Search's Brandable Domain Name Generation Algorithm

Posted by Matt Mazur (@mhmazur)

At the end of March I launched Lean Domain Search's new Brandable Domain Names section. Brandable domain names, for those of you not familiar with the term, are domain names that can be used for a wide variety of websites. Think names like Obsera and Innoviza. The names don't convey the site's purpose, so they can be branded for pretty much anything. Since the launch, over 1,000 brandable domain names have been released at a rate of 1 per hour. At the time of this writing, almost 20% of them have been registered.

I've received a few emails about how I generate these domain names, so I figured I'd write up a short blog post explaining the process. It's slightly complicated, but hopefully by the end you will have a pretty good idea of how it works. This tutorial won't include the code I actually use, though you are free to implement the algorithm on your own if you'd like to experiment with it.

How it Works

The key to generating good brandable domain names is to ensure that they are pronounceable. This is easier said than done though. If you throw a bunch of letters together randomly, you'll more than likely wind up with something that is entirely unpronounceable. What you need is some list of letter combinations that can be pronounced easily. While not comprehensive, a standard English dictionary is a great place to start.

What if we took every English word that ends in US and replaced the US with an A?

The English word list might look like this:

 1. abacus         5. adieus          9. airbus
 2. abstemious     6. adulterous     10. alumnus
 3. acanthus       7. advantageous   11. ambidextrous
 4. acrimonious    8. adventurous    12. ambiguous

Replacing the trailing US's with A's changes the names to:

 1. abaca          5. adiea           9. airba
 2. abstemioa      6. adulteroa      10. alumna
 3. acantha        7. advantageoa    11. ambidextroa
 4. acrimonioa     8. adventuroa     12. ambiguoa

By generating domain names based on English words that we already know are pronounceable, we've managed to generate a list of domain name ideas that are also mostly pronounceable. There are some exceptions: if the original word ended in IOUS or OUS then it now ends in IOA or OA, which is not very pronounceable, but we can add some rules that say to disregard those when coming up with the actual list.
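The suffix swap and the "disregard awkward endings" rule can be sketched in a few lines of Python. This is only an illustration, not the code behind Lean Domain Search: the function name `swap_suffix`, its parameters, and the tiny word list are made up for the example.

```python
def swap_suffix(words, old="us", new="a", banned_endings=("ioa", "oa")):
    """Replace a trailing `old` with `new`, skipping results with awkward endings.

    `banned_endings` mirrors the IOA/OA rule described in the post; the
    function itself is a hypothetical helper, not the author's code.
    """
    candidates = []
    for word in words:
        word = word.lower()
        if not word.endswith(old):
            continue  # only words with the target suffix are of interest
        candidate = word[: -len(old)] + new
        if candidate.endswith(banned_endings):
            continue  # e.g. "abstemioa" and "adulteroa" are hard to pronounce
        candidates.append(candidate)
    return candidates


# A small sample word list standing in for a full English dictionary.
words = ["abacus", "abstemious", "acanthus", "adulterous", "alumnus", "banana"]
print(swap_suffix(words))  # ['abaca', 'acantha', 'alumna']
```

Passing the endings as a parameter makes it easy to experiment with other rules without touching the loop.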

Using a dictionary of common English words is a good start, but it's somewhat limited. In the list of common English words that I am using, there are 602 words that end in US. Not bad, but not a huge number either. Is there somewhere else we can look?

Using the Zone File for Inspiration

Enter VeriSign's zone file. This list, published daily, contains most of the registered .com domain names in existence. With over 100 million domain names and counting, this is an invaluable resource for generating domain name ideas. For every registered domain name out there, someone decided that it was good enough to pay money for, which means that it is more likely than not to be pronounceable. By looking at existing domain names and making slight modifications, we can generate domain names of our own that are also likely pronounceable. There are a few things we need to do first though.

The zone file that I am using contains 609,888 .com domain names that end in US. If all we did was replace the trailing US with an A for all of those domain names you'd wind up with a pretty bad list of domain name ideas. For example, the original list would contain domain names such as:

 1. telcoplus             5. damaus                   9. lifehaus
 2. andreassophocleous    6. tutututus               10. faziosworldfamous
 3. spyralplus            7. unique2us               11. janjakstatkus
 4. ourscampus            8. guerillamarketingplus   12. bookious

Replacing the trailing US with A's results in:

 1. telcopla              5. damaa                    9. lifehaa
 2. andreassophocleoa     6. tutututa                10. faziosworldfamoa
 3. spyralpla             7. unique2a                11. janjakstatka
 4. ourscampa             8. guerillamarketingpla    12. bookioa

Not a bad start, but there are still a lot of problems: some are not pronounceable, some contain numbers, and others contain words that make them unusable as company names (guerillamarketingpla will be read as Guerilla Marketing Pla, for example — a name no one would want to use). Some of these issues can be mitigated with various rules: only look at domain names of a certain length, ignore ones with numbers and dashes, etc, but you still have a problem that some convey meaning. ourcampa would pass all of our rules, but it's still not a good domain name. What to do?
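The mechanical rules mentioned above (length limits, no numbers, no dashes) are easy to express in code. The sketch below is an assumption about how such a filter might look; the helper name `passes_basic_rules` is hypothetical, and the 5 to 9 character bounds are the ones the post settles on later.

```python
import re

# Lowercase letters only: this rejects digits and dashes in one check.
LETTERS_ONLY = re.compile(r"[a-z]+")


def passes_basic_rules(name, min_len=5, max_len=9):
    """True if `name` is all letters and its length is within the bounds."""
    if not (min_len <= len(name) <= max_len):
        return False
    return LETTERS_ONLY.fullmatch(name) is not None


for name in ["unique2a", "guerillamarketingpla", "ourcampa"]:
    print(name, passes_basic_rules(name))
# unique2a False
# guerillamarketingpla False
# ourcampa True
```

Note that ourcampa passes, which is exactly the problem: rules like these cannot tell that a name still carries unwanted meaning.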

The Importance of Common Roots

The root of a domain name is the part of the domain name that does not contain its suffix or prefix. For example, in a domain name like github, hub is the suffix and git is the root. In a domain name like ourcampus, our is the prefix and campus is the root. What if instead we said that ourcamp is the root and us is the suffix? That's not what most of us would consider the root and suffix, but bear with me for a second.

What if you looked at all of the roots for domain names that end in US and compared them to the roots of all domain names that end in, say, IS? By looking at the roots that they have in common, we'll likely end up with a pretty good list of pronounceable roots. And because each of those roots is registered with multiple suffixes, there's a good chance that it doesn't have an actual meaning (for example, guerillamarketingplus would be a result for US, but guerillamarketingplis with an IS is unlikely to be registered, so guerillamarketingpl wouldn't make the list of common roots).

There are 609,888 .com domain names that end in US, which means there are 609,888 roots for those domain names. There are 540,887 .com domain names that end with IS and therefore 540,887 roots. Of those, there are 26,782 roots in common. By adding a new suffix such as A to these common roots, you wind up with some pretty good domain name ideas. If you restrict it to results that are between 5 and 9 characters (all 1, 2, 3, and 4 letter .coms are registered, and 10 or more for a brandable name tends to be too long), remove all of the names that contain numbers and dashes, and apply a few rules (no domain names that end in UA, YA, ZA, etc.), you can reduce the list to a mere 10,021 domain name ideas. These domains include:

 1. sitella     5. mymana      9. ideasa
 2. vizala      6. mirsala    10. sponsora
 3. atoura      7. greva      11. latra
 4. applieda    8. igena      12. spirala

To drive the point home: if you replace the trailing A with a US or an IS then those domain names are registered. For example, the presence of Vizala means that Vizalus and Vizalis are registered.
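The common-roots idea maps naturally onto set intersection. Here is a minimal sketch, assuming the registered names are already loaded into a Python set of lowercase strings without the .com part; the function names and the toy data are hypothetical and only illustrate the technique.

```python
def roots_for(domains, suffix):
    """Return the roots of every name ending in `suffix`, with the suffix removed."""
    return {
        name[: -len(suffix)]
        for name in domains
        if name.endswith(suffix) and len(name) > len(suffix)
    }


def common_root_candidates(domains, suffixes=("us", "is"), new_suffix="a"):
    """Attach `new_suffix` to the roots that are registered with every suffix."""
    root_sets = [roots_for(domains, suffix) for suffix in suffixes]
    common = set.intersection(*root_sets)
    return sorted(root + new_suffix for root in common)


# Toy stand-in for the zone file; a real run would use millions of names.
registered = {
    "vizalus", "vizalis",
    "sitellus", "sitellis",
    "guerillamarketingplus",  # no matching "...plis", so its root is dropped
    "ourcampus",              # no matching "ourcampis" either
}
print(common_root_candidates(registered))  # ['sitella', 'vizala']
```

Because the roots are stored in sets, the comparison stays cheap even when each suffix has several hundred thousand roots.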

At this point we have a list of domain name ideas but we haven't checked to see which are still available to register. After running them through an availability checking script, we're left with 1,211 domain names. These include:

 1. quantila    5. innovira    9. relatora
 2. arvenda     6. holdena    10. primorda
 3. prodova     7. cypera     11. tacticala
 4. netixa      8. vangela    12. ubiquida

Not bad, right?

The final step for me is to manually review these results. As good as this method is at generating available brandable domain names, it still comes up with some bad names so I wind up reviewing the results and selecting which ones to add to Lean Domain Search.

By playing with which suffixes it checks the roots for and what new suffixes to add to the common roots, this algorithm can be used to generate thousands of great available domain names.

Summary

To recap, here's how the algorithm works:

  1. Determine all of the registered .com domain names that end with specific suffixes (US and IS in this case)
  2. Figure out the roots for those domain names and determine which ones they have in common
  3. Add a new suffix to those roots (A in this case)
  4. Programmatically remove domain names that indicate low quality (numbers, dashes, length, certain letter combinations, etc)
  5. Check which of those domain names are available
  6. Manually review the results for quality
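Steps 1 through 5 can be strung together roughly as follows. This is a sketch under stated assumptions rather than the production code: `generate_candidates` and its parameters are invented names, the registered names are assumed to be plain lowercase strings without the .com part, and the availability check is passed in as a function because a real check would need a registrar or WHOIS lookup. Step 6, the manual review, stays manual.

```python
import re


def generate_candidates(registered, is_available, suffixes=("us", "is"),
                        new_suffix="a", min_len=5, max_len=9,
                        banned_endings=("ua", "ya", "za")):
    """Run steps 1-5 of the recap on a collection of registered names."""
    registered = set(registered)

    # Steps 1-2: collect the roots for each suffix and keep the shared ones.
    root_sets = [
        {name[: -len(s)] for name in registered
         if name.endswith(s) and len(name) > len(s)}
        for s in suffixes
    ]
    common_roots = set.intersection(*root_sets)

    # Step 3: attach the new suffix to every common root.
    names = (root + new_suffix for root in common_roots)

    # Step 4: drop names that fail the quality rules.
    names = [
        name for name in names
        if min_len <= len(name) <= max_len
        and re.fullmatch(r"[a-z]+", name)        # no numbers or dashes
        and not name.endswith(banned_endings)    # awkward letter combinations
    ]

    # Step 5: keep only the names the supplied checker reports as available.
    return sorted(name for name in names if is_available(name))


# Made-up data: every name here is hypothetical.
registered = {
    "zorbelus", "zorbelis",
    "mintarus", "mintaris", "mintara",  # "mintara" is already taken
    "bigzus", "bigzis",                 # would yield "bigza" (banned ending)
    "x-tremus", "x-tremis",             # contains a dash
    "ourcampus",                        # no "ourcampis", so no common root
}
# Stand-in availability check: treat anything not in the toy set as available.
print(generate_candidates(registered, lambda name: name not in registered))
# ['zorbela']
```

Changing `suffixes` and `new_suffix` is all it takes to try other combinations.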

If you have any questions, feedback, or ideas on how to improve it, please feel free to leave a comment or email me at matt@leandomainsearch.com.

Thanks!
Matt

View comments on HackerNews


13 May 2013

Improving Lean Domain Search's Trending Topics Section

Posted by Matt Mazur (@mhmazur)

At the end of January I launched the Domain Name Trends section on Lean Domain Search which allows you to research trends in .com domain name registrations.

I planned on launching on a Monday (launching late in the week is rarely a good idea), but finished everything up a few days early so I spent the weekend prior to launch hastily implementing an additional feature that would automatically identify trending topics each day. The daily reports that it generated were not bad, but not great either:

I was never too happy with the quality of this report, but I got involved with some other higher priority projects and so improving it got put on the back burner for a few months. Finally, about two weeks ago, I started working on a new version. Here's the result:

View the actual report.

The three major changes are:

Weekly reports vs daily reports

Despite something like 100,000 new .com domain name registrations each day, it's hard to identify meaningful trends using a single day's worth of data. When you think "domain name trends" you probably imagine something on a larger timescale anyway which is why I decided to eliminate the daily trend reports and replace them with weekly trend reports. If you're a domain name investor, these reports should provide a lot more value than the daily ones because they identify trending topics over a longer time period. In the future I might extend the analysis to include monthly reports too.

If you'd like to view all of the weekly reports, you can head on over to the trending topics archive. I went back and recalculated the trending topics for each week in 2013 so you can go back in time and see what topics folks were registering domain names for.

Domain names are no longer listed on the report

In the old version you could see a list of the new domain names below each list item. I originally thought this would be helpful, but over time I came to see how it bloated the report. In this new version you have to go to the trending topic page for that topic to view the actual domain names.

New trending topic identification algorithms

In the past, the daily trend reports looked at the quantity of domain names registered for a certain topic to identify and rank trends. A better way, and the one that the new weekly trend reports use, is to look at the percentage change from week to week. This method tends to produce much more accurate results. For example, the top trending topic from last week was fucai, which saw a 6,500% increase from the week prior (the purpose of these registrations is a topic for another blog post).
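Ranking by week-over-week percentage change could look something like the sketch below. It is an illustration only: the function name, the decision to skip topics with no registrations in the prior week, and the counts are all assumptions (the counts are made up so that fucai shows a 6,500% increase).

```python
def rank_trending(last_week, this_week, min_previous=1):
    """Rank topics by percentage change in registrations between two weeks.

    Topics with fewer than `min_previous` registrations in the prior week
    are skipped, since a percentage change needs a non-zero baseline.
    """
    ranked = []
    for topic, current in this_week.items():
        previous = last_week.get(topic, 0)
        if previous < min_previous:
            continue
        change = (current - previous) * 100 / previous
        ranked.append((topic, change))
    # Largest percentage increase first.
    return sorted(ranked, key=lambda item: item[1], reverse=True)


# Hypothetical registration counts per topic.
last_week = {"fucai": 2, "cloud": 400, "bitcoin": 50}
this_week = {"fucai": 132, "cloud": 420, "bitcoin": 75, "newtopic": 10}

for topic, change in rank_trending(last_week, this_week):
    print(f"{topic}: {change:+.0f}%")
# fucai: +6500%
# bitcoin: +50%
# cloud: +5%
```

Ranked by raw quantity, cloud would come first; ranked by percentage change, the sudden spike in fucai stands out instead. Raising `min_previous` is one way to keep very small topics from dominating the list.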

One final note: I recently changed trend result pages to show the last six months' worth of trends instead of all the trends since January 1, 2013 in order to improve their speed. I now see that the shorter time period, while faster, is less useful, so I am changing it to show the last year's worth of data instead. The results should still be quite fast and will be much more informative.

Hope you enjoy these changes —

Matt

