Imagine there's no countries

Hello everyone! I know, it's been months since the last time I wrote in this dev blog... But I haven't been staring at the ceiling or watching the grass grow, I made a lot of progress in Winds of Trade, I swear! I'll write about that later, though.

Today I want to share with all of you how I implemented the algorithm to put labels for the procedurally generated countries on the map.

First of all, let's start with the finished result so you can have a notion of what I'm talking about:

Those Urinians are everywhere!

So, basically I have a two-dimensional array where I store what country owns each square in the map and I had to generate labels for them. I wanted to achieve something similar to what Europa Universalis 4 does but using straight labels instead of curved ones in order to make implementation easier. So I needed an algorithm capable of:

1) Identifying every contiguous region in the map.
2) Finding the best orientation (rotation) for the label.
3) Deciding where to place the label.
4) Choosing the right size for the label.

It turns out problem #1 can be solved pretty easily using any of the commonly used fill algorithms. There are lots of examples out there in the web, but basically what you do is to take an initial tile and move to every contiguous tile with the same color until you cannot do it anymore, always remembering which tiles you've already visited so you do not visit them again. Then you pick another non-visited tile and keep going on and on until every tile in the whole map has been processed. Easy!

Problem #2 was driving me crazy, I couldn't think of any good ways to solve it, so I asked a friend of mine, who is the smartest guy I know, to help me and he suggested something brilliant. It's a very intuitive and easy to implement solution, just what I was looking for! He said "you could use a linear regression in order to find the slope of the straight line that best fits the shape of the country".

If you do not know what linear regression is you can probably take a look at the Wikipedia article I linked above. But basically it's a way to find the straight line that best fits a number of data points. Which is pretty much what we need here!

But an image is worth a thousand words:

First you use the tiles that make up your region as a dataset...

Once you have all the tiles of your region as a dataset you apply linear regression and you get a beautiful straight line. The slope of this line will be the orientation of your label:

... and via linear regression you get a beautiful slope.

So that pretty much solves problem #2. You might be wondering "what happens if I have a region that's pretty much a vertical line?". In that case you are going to get a horizontal label that goes across the width of the country instead of its length, and that isn't what we want... So in order to avoid that problem I make two linear regressions, one using the tile data as (x,y) and one as (y,x). I then stay with the one that allows me to make the longest label. Not very elegant but it works!

That leaves us with problems #3 (position) and #4 (size). It turns out they are relatively easy to solve. The best position can be found quickly by brute force near the central points of the region. You try a few possible positions near the center and then keep the one that allows for the longest uninterrupted chain to either side using the orientation obtained from step #2. Choosing a size can be done easily knowing the length of the chain calculated in step #3 and the length of the name of the country.

The finished label

You then repeat these steps for every region in the map (over a certain size threshold) and you're good to go! I hope someone will find this - very incomplete and poorly written - information useful. Thanks for reading!