Site Update

I just realised that the tagline of this site still stated “MapBasic”, amongst other things. It’s been a good 4+ years since I last touched that abomination of a language, so it’s well and truly time for that tag to go!

To all those still stuck using MapInfo/MapBasic… you have my condolences.

Thoughts on “FOSS4G/SOTM Oceania 2018”, and the PyQGIS API improvements which it caused

Last week the first official “FOSS4G/SOTM Oceania” conference was held at Melbourne University. This was a fantastic event, and there’s simply no way I can extend sufficient thanks to all the organisers and volunteers who put this event together. They did a brilliant job, and their efforts are even more impressive considering it was the inaugural event!

Upfront — this is not a recap of the conference (I’m sure someone else is working on a much more detailed write up of the event!), just some musings I’ve had following my experiences assisting Nathan Woodrow deliver an introductory Python for QGIS workshop he put together for the conference. In short, we both found that delivering this workshop to a group of PyQGIS newcomers was a great way for us to identify “pain points” in the PyQGIS API and areas where we need to improve. The good news is that as a direct result of the experiences during this workshop the API has been improved and streamlined! Let’s explore how:

Part of Nathan’s workshop (notes are available here) focused on a hands-on example of creating a custom QGIS “Processing” script. I’ve found that preparing workshops is guaranteed to expose a bunch of rare and tricky software bugs, and this was no exception! Unfortunately the workshop was scheduled just before the QGIS 3.4.2 patch release which fixed these bugs, but at least they’re fixed now and we can move on…

The bulk of Nathan’s example algorithm is contained within the following block (where “distance” is the length of line segments we want to chop our features up into):

for input_feature in enumerate(features):
    geom = feature.geometry().constGet()
    if isinstance(geom, QgsLineString):
        continue
    first_part = geom.geometryN(0)
    start = 0
    end = distance
    length = first_part.length()

    while start < length:
        new_geom = first_part.curveSubstring(start,end)

        output_feature = input_feature
        output_feature.setGeometry(QgsGeometry(new_geom))
        sink.addFeature(output_feature)

        start += distance
        end += distance

There’s a lot here, but really the guts of this algorithm breaks down to one line:

new_geom = first_part.curveSubstring(start,end)

Basically, a new geometry is created for each trimmed section in the output layer by calling the “curveSubstring” method on the input geometry and passing it a start and end distance along the input line. This returns the portion of that input LineString (or CircularString, or CompoundCurve) between those distances. The PyQGIS API nicely hides the details here – you can safely call this one method and be confident that regardless of the input geometry type the result will be correct.

Unfortunately, while calling the “curveSubstring” method is elegant, all the code surrounding this call is not so elegant. As a (mostly) full-time QGIS developer myself, I tend to look over oddities in the API. It’s easy to justify ugly API as just “how it’s always been”, and over time it’s natural to develop a type of blind spot to these issues.

Let’s start with the first ugly part of this code:

geom = input_feature.geometry().constGet()
if isinstance(geom, QgsLineString):
    continue
first_part = geom.geometryN(0)
# chop first_part into sections of desired length
...

This is rather… confusing… logic to follow. Here the script is fetching the geometry of the input feature, checking if it’s a LineString, and if it IS, then it skips that feature and continues to the next. Wait… what? It’s skipping features with LineString geometries?

Well, yes. The algorithm was written specifically for one workshop, which was using a MultiLineString layer as the demo layer. The script takes a huge shortcut here and says “if the input feature isn’t a MultiLineString, ignore it — we only know how to deal with multi-part geometries”. Immediately following this logic there’s a call to geometryN( 0 ), which returns just the first part of the MultiLineString geometry.

There’s two issues here — one is that the script just plain won’t work for LineString inputs, and the second is that it ignores everything BUT the first part in the geometry. While it would be possible to fix the script and add a check for the input geometry type, put in logic to loop over all the parts of a multi-part input, etc, that’s instantly going to add a LOT of complexity or duplicate code here.

Fortunately, this was the perfect excuse to improve the PyQGIS API itself so that this kind of operation is simpler in future! Nathan and I had a debrief/brainstorm after the workshop, and as a result a new “parts iterator” has been implemented and merged to QGIS master. It’ll be available from version 3.6 on. Using the new iterator, we can simplify the script:

geom = input_feature.geometry()
for part in geom.parts():
    # chop part into sections of desired length
    ...

Win! This is simultaneously more readable, more Pythonic, and automatically works for both LineString and MultiLineString inputs (and in the case of MultiLineStrings, we now correctly handle all parts).

Here’s another pain-point. Looking at the block:

new_geom = part.curveSubstring(start,end)
output_feature = input_feature
output_feature.setGeometry(QgsGeometry(new_geom))

At first glance this looks reasonable – we use curveSubstring to get the portion of the curve, then make a copy of the input_feature as output_feature (this ensures that the features output by the algorithm maintain all the attributes from the input features), and finally set the geometry of the output_feature to be the newly calculated curve portion. The ugliness here comes in this line:

output_feature.setGeometry(QgsGeometry(new_geom))

What’s that extra QgsGeometry(…) call doing here? Without getting too sidetracked into the QGIS geometry API internals, QgsFeature.setGeometry requires a QgsGeometry argument, not the QgsAbstractGeometry subclass which is returned by curveSubstring.

This is a prime example of a “paper-cut” style issue in the PyQGIS API. Experienced developers know and understand the reasons behind this, but for newcomers to PyQGIS, it’s an obscure complexity. Fortunately the solution here was simple — and after the workshop Nathan and I added a new overload to QgsFeature.setGeometry which accepts a QgsAbstractGeometry argument. So in QGIS 3.6 this line can be simplified to:

output_feature.setGeometry(new_geom)

Or, if you wanted to make things more concise, you could put the curveSubstring call directly in here:

output_feature = input_feature
output_feature.setGeometry(part.curveSubstring(start,end))

Let’s have a look at the simplified script for QGIS 3.6:

for input_feature in enumerate(features):
    geom = feature.geometry()
    for part in geom.parts():
        start = 0
        end = distance
        length = part.length()

        while start < length:
            output_feature = input_feature
            output_feature.setGeometry(part.curveSubstring(start,end))
            sink.addFeature(output_feature)

            start += distance
            end += distance

This is MUCH nicer, and will be much easier to explain in the next workshop! The good news is that Nathan has more niceness on the way which will further improve the process of writing QGIS Processing script algorithms. You can see some early prototypes of this work here:

So there we go. The process of writing and delivering a workshop helps to look past “API blind spots” and identify the ugly points and traps for those new to the API. As a direct result of this FOSS4G/SOTM Oceania 2018 Workshop, the QGIS 3.6 PyQGIS API will be easier to use, more readable, and less buggy! That’s a win all round!

Tagged , , , , ,

The Inaugural QGIS Australia Hackfest – Noosa 2017

Last week we kicked off the first (of hopefully many) Australian QGIS hackfests Developers Meetings. It was attended by 3 of the core QGIS development team: Nathan Woodrow, Martin Dobias and myself (Nyall Dawson), along with various family members. While there’s been QGIS hackfests in Europe for over 10 years, and others scattered throughout various countries (I think there was a Japanese one recently… but Twitter’s translate tool leaves me with little confidence about this!), there’s been no events like this in the Southern hemisphere yet. I’ve been to a couple in Europe and found them to be a great way to build involvement in the project, for both developers and non-developers alike.

In truth the Australian hackfest plans began mostly an excuse for Nathan and I to catch up with Martin Dobias before he heads back out of this hemisphere and returns to Europe. That said, Nathan and I have long spoken about ways we can build up the QGIS community in Australia, so in many ways this was a trial run for future events. It was based it in Noosa, QLD (and yes, we did manage to tear ourselves away from our screens long enough to visit the beach!).

Nathan Woodrow (@NathanW2), myself (@nyalldawson), and Martin Dobias (@wonder-sk)

Here’s a short summary of what we worked on during the hackfest:

  • Martin implemented a new iterator style accessor for vertices within geometries. The current approach to accessing vertices in QGIS is far from optimal. You either have the choice of an inefficient methods (eg QgsGeometry.asPolyline(), asPolygon(), etc) which requires translations of all vertices to a different data structure (losing any z/m dimensional values in the process), or an equally inefficient QgsAbstractGeometry.coordinateSequence() method, which at least keeps z/m values but still requires expensive copies of every vertex in the geometry. For QGIS 3.0 we’ve made a huge focus on optimising geometry operations and vertex access is one of the largest performance killers remaining in the QGIS code. Martin’s work adds a proper iterator for the vertices within a geometry object, both avoiding all these expensive copies and also simplifying the API for plugins. When this work lands traversing the vertices will become as simple as
for v in geom.vertices():
   ... do something with the vertex!
  • Martin is also planning on extending this work to allow simple iteration over the parts and rings within geometries too. When this lands in QGIS we can expect to see much faster geometry operations.
  • Nathan fixed a long standing hassle with running standalone PyQGIS scripts outside of the QGIS application on Windows. In earlier versions there’s a LOT of batch file mangling and environment variable juggling required before you can safely import the qgis libraries within Python. Thanks to Nathan’s work, in QGIS 3.0 this will be as simple as just making sure that the QGIS python libraries are included in your Python path, and then importing qgis.core/gui etc will work without any need to create environment variables for OSGEO/GDAL/PLUGINS/etc. Anyone who has fought with this in the past will definitely appreciate this change, and users of Python IDEs will also appreciate how simple it is now to make the PyQGIS libraries available in these environments.
  • Nathan also worked on “profiles” support for QGIS 3.0. This work will add isolated user profiles within QGIS, similar to how Chrome handles this. Each profile has it’s own separate set of settings, plugins, etc. This work is designed to benefit both plugin developers and QGIS users within enterprise environments. You can read more about what Nathan has planned for this here.
  • I continued the ongoing work of moving long running interface “blocking” operations to background tasks. In QGIS 3.0 many of these tasks churn away in the background, allowing you to continue work while the operation completes. It’s been implemented so far for vector and raster layer saving, map exports to images/PDF (not composers unfortunately), and obtaining feature counts within legends. During the hackfest I moved the layer import which occurs when you drag and drop a layer to a destination in the browser to a background task.
  • On the same topic, I took some inspiration from a commit in Sourcepole’s QGIS fork and reworked how composer maps are cached. One of my biggest gripes with QGIS’ composer is how slow it is to work with when you’ve got a complex map included. This change pushes the map redrawing into a background thread, so that these redraws no longer “lock up” the UI. It makes a HUGE difference in how usable composer is. This improvement also allowed me to remove those confusing map item “modes” (Cache/Render/Rectangle) – now everything is redrawn silently in the background whenever required.
  • Lastly, I spent a lot of time on a fun feature I’ve long wanted in QGIS – a unified search “locator” bar. This feature is heavily inspired by Qt Creator’s locator bar. It sits away down in the status bar, and entering any text here fires up a bunch of background search tasks. Inbuilt searches include searching the layers within the current project (am I the only one who loses layers in the tree in complex projects!?), print layouts in the project, processing algorithms, and menu/toolbar actions. The intention here is that plugins will “take over” and add additional search functionality, such as OSM place names searching, data catalog searches, etc. I’m sure when QGIS 3.0 is released this will quickly become indispensable!

The upcoming QGIS 3.0 locator bar

Big thanks go out to Nathan’s wife, Stacey, who organized most of the event and without whom it probably would never have happened, and to Lutra Consulting who sponsored an awesome dinner for the attendees.

We’d love this to be the first of many. The mature European hackfests are attended by a huge swath of the community, including translators, documentation writers, and plugin developers (amongst others). If you’ve ever been interested in finding out how you can get more involved in the project it’s a great way to dive in and start contributing. There’s many QGIS users in this part of the world and we really want to encourage a community of contributors who “give back” to the project. So let Nathan or myself know if you’d be interested in attending other events like this, or helping to organize them locally yourself…

Tagged , , , ,

About label halos

A lot of cartographers have a love/hate relationship with label halos. On one hand they can be an essential technique for improving label readability, especially against complex background layers. On the other hand they tend to dominate maps and draw unwanted attention to the map labels.

In this post I’m going to share my preferred techniques for using label halos. I personally find this technique is a good approach which minimises the negative effects of halos, while still providing a good boost to label readability. (I’m also going to share some related QGIS 3.0 news at the end of this post!)

Let’s start with some simple white labels over an aerial image:

These labels aren’t very effective. The complex background makes them hard to read, especially the “Winton Shire” label at the bottom of the image. A quick and nasty way to improve readability is to add a black halo around the labels:

Sure, it’s easy to read the labels now, but they stand out way too much and it’s difficult to see anything here except the labels!

We can improve this somewhat through a better choice of halo colour:

This is much better. We’ve got readable labels which aren’t too domineering. Unfortunately the halo effect is still very prominent, especially where the background image varies a lot. In this case it works well for the labels toward the middle of the map, but not so well for the labels at the top and bottom.

A good way to improve this is to take advantage of blending (or “composition”) modes (which QGIS has native support for). The white labels will be most readable when there’s a good contrast with the background map, i.e. when the background map is dark. That’s why we choose a halo colour which is darker than the text colour (or vice versa if you’ve got dark coloured labels). Unfortunately, by choosing the mid-toned brown colour to make the halos blend in more, we are actually lightening up parts of this background layer and both reducing the contrast with the label and also making the halo more visible. By using the “darken” blend mode, the brown halo will only be drawn for pixels were the brown is darker then the existing background. It will darken light areas of the image, but avoid lightening pixels which are already dark and providing good contrast. Here’s what this looks like:

The most noticeable differences are the labels shown above darker areas – the “Winton Shire” label at the bottom and the “Etheridge Shire” at the top. For both these labels the halo is almost imperceptible whilst still subtly doing it’s part to make the label readable. (If you had dark label text with a lighter halo color, you can use the “lighten” blend mode for the same result).

The only issue with this map is that the halo is still very obvious around “Shire” in “Richmond Shire” and “McKinlay” on the left of the map. This can be reduced by applying a light blur to the halo:

There’s almost no loss of readability by applying this blur, but it’s made those last prominent halos disappear into the map. At first glance you probably wouldn’t even notice that there’s any halos being used here. But if we compare back against the original map (which used no halos) we can see the huge difference in readability:

Compare especially the Winton Shire label at the bottom, and the Richmond Shire label in the middle. These are much clearer on our tweaked map versus the above image.

Now for the good news… when QGIS 3.0 is released you’ll no longer have to rely on an external illustration/editing application to get this effect with your maps. In fact, QGIS 3.0 is bringing native support for applying many types of live layer effects to label buffers and background shapes, including blur. This means it will be possible to reproduce this technique directly inside your GIS, no external editing or tweaking required!

Tagged , , , , , ,

New map coloring algorithms in QGIS 3.0

It’s been a long time since I last blogged here. Let’s just blame that on the amount of changes going into QGIS 3.0 and move on…

One new feature which landed in QGIS 3.0 today is a processing algorithm for automatic coloring of a map in such a way that adjoining polygons are all assigned different color indexes. Astute readers may be aware that this was possible in earlier versions of QGIS through the use of either the (QGIS 1.x only!) Topocolor plugin, or the Coloring a map plugin (2.x).

What’s interesting about this new processing algorithm is that it introduces several refinements for cartographically optimising the coloring. The earlier plugins both operated by pure “graph” coloring techniques. What this means is that first a graph consisting of each set of adjoining features is generated. Then, based purely on this abstract graph, the coloring algorithms are applied to optimise the solution so that connected graph nodes are assigned different colors, whilst keeping the total number of colors required minimised.

The new QGIS algorithm works in a different way. Whilst the first step is still calculating the graph of adjoining features (now super-fast due to use of spatial indexes and prepared geometry intersection tests!), the colors for the graph are assigned while considering the spatial arrangement of all features. It’s gone from a purely abstract mathematical solution to a context-sensitive cartographic solution.

The “Topological coloring” processing algorithm

Let’s explore the differences. First up, the algorithm has an option for the “minimum distance between features”. It’s often the case that features aren’t really touching, but are instead just very close to each other. Even though they aren’t touching, we still don’t want these features to be assigned the same color. This option allows you to control the minimum distance which two features can be to each other before they can be assigned the same color.

The biggest change comes in the “balancing” techniques available in the new algorithm. By default, the algorithm now tries to assign colors in such a way that the total number of features assigned each color is equalised. This avoids having a color which is only assigned to a couple of features in a large dataset, resulting in an odd looking map coloration.

Balancing color assignment by count – notice how each class has a (almost!) equal count

Another available balancing technique is to balance the color assignment by total area. This technique assigns colors so that the total area of the features assigned to each color is balanced. This mode can be useful to help avoid large features resulting in one of the colors appearing more dominant on a colored map.

Balancing assignment by area – note how only one large feature is assigned the red color

The final technique, and my personal preference, is to balance colors by distance between colors. This mode will assign colors in order to maximize the distance between features of the same color. Maximising the distance helps to create a more uniform distribution of colors across a map, and avoids certain colors clustering in a particular area of the map. It’s my preference as it creates a really nice balanced map – at a glance the colors look “randomly” assigned with no discernible pattern to the arrangement.

Balancing colors by distance

As these examples show, considering the geographic arrangement of features while coloring allows us to optimise the assigned colors for cartographic output.

The other nice thing about having this feature implemented as a processing algorithm is that unlike standalone plugins, processing algorithms can be incorporated as just one step of a larger model (and also reused by other plugins!).

QGIS 3.0 has tons of great new features, speed boosts and stability bumps. This is just a tiny taste of the handy new features which will be available when 3.0 is released!

Tagged , , , , ,

Speeding up your PyQGIS scripts

I’ve recently spent some time optimising the performance of various QGIS plugins and algorithms, and I’ve noticed that there’s a few common performance traps which developers fall into when fetching features from a vector layer. In this post I’m going to explore these traps, what makes them slow, and how to avoid them.

As a bit of background, features are fetched from a vector layer in QGIS using a QgsFeatureRequest object. Common use is something like this:

request = QgsFeatureRequest()
for feature in vector_layer.getFeatures(request):
    # do something

This code would iterate over all the features in layer. Filtering the features is done by tweaking the QgsFeatureRequest, such as:

request = QgsFeatureRequest().setFilterFid(1001)
feature_1001 = next(vector_layer.getFeatures(request))

In this case calling getFeatures(request) just returns the single feature with an ID of 1001 (which is why we shortcut and use next(…) here instead of iterating over the results).

Now, here’s the trap: calling getFeatures is expensive. If you call it on a vector layer, QGIS will be required to setup an new connection to the data store (the layer provider), create some query to return data, and parse each result as it is returned from the provider. This can be slow, especially if you’re working with some type of remote layer, such as a PostGIS table over a VPN connection. This brings us to our first trap:

Trap #1: Minimise the calls to getFeatures()

A common task in PyQGIS code is to take a list of feature IDs and then request those features from the layer. A see a lot of older code which does this using something like:

for id in some_list_of_feature_ids:
    request = QgsFeatureRequest().setFilterFid(id)
    feature = next(vector_layer.getFeatures(request))
    # do something with the feature

Why is this a bad idea? Well, remember that every time you call getFeatures() QGIS needs to do a whole bunch of things before it can start giving you the matching features. In this case, the code is calling getFeatures() once for every feature ID in the list. So if the list had 100 features, that means QGIS is having to create a connection to the data source, set up and prepare a query to match a single feature, wait for the provider to process that, and then finally parse the single feature result. That’s a lot of wasted processing!

If the code is rewritten to take the call to getFeatures() outside of the loop, then the result is:

request = QgsFeatureRequest().setFilterFids(some_list_of_feature_ids)
for feature in vector_layer.getFeatures(request):
    # do something with the feature

Now there’s just a single call to getFeatures() here. QGIS optimises this request by using a single connection to the data source, preparing the query just once, and fetching the results in appropriately sized batches. The difference is huge, especially if you’re dealing with a large number of features.

Trap #2: Use QgsFeatureRequest filters appropriately

Here’s another common mistake I see in PyQGIS code. I often see this one when an author is trying to do something with all the selected features in a layer:

for feature in vector_layer.getFeatures():
    if not feature.id() in vector_layer.selectedFeaturesIds():
        continue

    # do something with the feature

What’s happening here is that the code is iterating over all the features in the layer, and then skipping over any which aren’t in the list of selected features. See the problem here? This code iterates over EVERY feature in the layer. If you’re layer has 10 million features, we are fetching every one of these from the data source, going through all the work of parsing it into a QGIS feature, and then promptly discarding it if it’s not in our list of selected features. It’s very inefficient, especially if fetching features is slow (such as when connecting to a remote database source).

Instead, this code should use the setFilterFids() method for QgsFeatureRequest:

request = QgsFeatureRequest().setFilterFids(vector_layer.selectedFeaturesIds())
for feature in vector_layer.getFeatures(request):
    # do something with the feature

Now, QGIS will only fetch features from the provider with matching feature IDs from the list. Instead of fetching and processing every feature in the layer, only the actual selected features will be fetched. It’s not uncommon to see operations which previously took many minutes (or hours!) drop down to a few seconds after applying this fix.

Another variant of this trap uses expressions to test the returned features:

filter_expression = QgsExpression('my_field &gt; 20')
for feature in vector_layer.getFeatures():
    if not filter_expression.evaluate(feature):
        continue

    # do something with the feature

Again, this code is fetching every single feature from the layer and then discarding it if it doesn’t match the “my_field > 20” filter expression. By rewriting this to:

request = QgsFeatureRequest().setFilterExpression('my_field &gt; 20')
for feature in vector_layer.getFeatures(request):
    # do something with the feature

we hand over the bulk of the filtering to the data source itself. Recent QGIS versions intelligently translate the filter into a format which can be applied directly at the provider, meaning that any relevant indexes and other optimisations can be applied by the provider itself. In this case the rewritten code means that ONLY the features matching the ‘my_field > 20’ criteria are fetched from the provider – there’s no time wasted messing around with features we don’t need.

 

Trap #3: Only request values you need

The last trap I often see is that more values are requested from the layer then are actually required. Let’s take the code:

my_sum = 0
for feature in vector_layer.getFeatures(request):
    my_sum += feature['value']

In this case there’s no way we can optimise the filters applied, since we need to process every feature in the layer. But – this code is still inefficient. By default QGIS will fetch all the details for a feature from the provider. This includes all attribute values and the feature’s geometry. That’s a lot of processing – QGIS needs to transform the values from their original format into a format usable by QGIS, and the feature’s geometry needs to be parsed from it’s original type and rebuilt as a QgsGeometry object. In our sample code above we aren’t doing anything with the geometry, and we are only using a single attribute from the layer. By calling setFlags( QgsFeatureRequest.NoGeometry ) and setSubsetOfAttributes() we can tell QGIS that we don’t need the geometry, and we only require a single attribute’s value:

my_sum = 0
request = QgsFeatureRequest().setFlags(QgsFeatureRequest.NoGeometry).setSubsetOfAttributes(['value'], vector_layer.fields() )
for feature in vector_layer.getFeatures(request):
    my_sum += feature['value']

None of the unnecessary geometry parsing will occur, and only the ‘value’ attribute will be fetched and populated in the features. This cuts down both on the processing required AND the amount of data transfer between the layer’s provider and QGIS. It’s a significant improvement if you’re dealing with larger layers.

Conclusion

Optimising your feature requests is one of the easiest ways to speed up your PyQGIS script! It’s worth spending some time looking over all your uses of getFeatures() to see whether you can cut down on what you’re requesting – the results can often be mind blowing!

Tagged , , , ,

How to effectively get things changed in QGIS – a follow up

Last week I posted regarding some thoughts I’ve had recently concerning what I perceive as a general confusion about how QGIS is developed and how users can successfully get things to change in the project. The post certainly started a lot of conversation! However, based on feedback received I realise some parts of the posts were being misinterpreted and some clarification is needed. So here we go…

1. Please keep filing bug reports/feature requests

I don’t think I was very clear about this – but my original post wasn’t meant to be a discouragement from filing bug reports or feature requests. The truth is that there is a LOT of value in these reports, and if you don’t file a report then the QGIS team will never be aware of the bug or your feature idea. Here’s some reasons why you SHOULD file a report:

  • QGIS developers are a conscientious mob, and generally take responsibility for any regressions they’ve caused by changes they’ve made. In other words, there’s very much an attitude of “I-broke-it, I’ll-fix-it” in the project. So, if a new feature is buggy or has broken something else then filing bugs ASAP is the best way to make the developer aware of these issues. In my experience they’ll usually be addressed rapidly.
  • As mentioned in the original post – there’s always a pre-release bug fix sprint, so filing a bug (especially if it’s a critical one) may mean that it’s addressed during this sprint.
  • Filing feature requests can gain traction if your idea is innovative, novel, or interesting enough to grab a developer’s attention!

Speaking for myself, I regularly check new incoming tickets (at least once a day), and I know I’m not the only one. So filing a report WILL bring your issue to developer’s attention. Which leads to…

2. Frustration is understandable!

I can honestly understand why people get frustrated and resort to an aggressive “why hasn’t this been fixed yet?!” style reply. I believe that these complaints are caused because people have the misunderstanding that filing a bug report is the ONLY thing they can do to get an issue fixed. If filing a report IS the only avenue you have to get something fixed/implemented, then it’s totally understandable to be annoyed when your ticket gets no results. This is a failing on behalf of the project though – we need to be clearly communicating that filing a report is the LEAST you can do. It’s a good first step, but on its own it’s just the beginning and needs to be followed up by one of the methods I described in the initial post.

3. It applies to more than just code

When I wrote the original piece I focused on just the code aspect of the QGIS project. That’s only because I’m a developer and it’s the area I know best. But it applies equally across the whole project, including documentation, translations, infrastructure, websites, packaging QGIS releases, etc. In fact, some of these non-code areas are the best entry points into the project as they don’t require a development background, and eg the documentation and translation teams have done a great job making it easy to submit contributions. Find something missing in the QGIS documentation? Add it yourself! Missing a translation of the website which prevents QGIS adoption within your community? Why not sponsor a translator to tackle this task?!

4. It applies to more than just QGIS!

Again, I wrote the original piece focusing on QGIS because that’s the project I’m most familiar with. You could just as easily substitute GDAL, GEOS, OpenLayers, PostGIS, Geoserver, R, D3, etc… in and it would be equally valid!

Hopefully that helps clarify some of the points raised by the earlier article. Let’s keep the discussion flowing – I’d love to hear if you have any other suggestions or questions raised by this topic.

 

 

 

 

Tagged , ,

How to effectively get things changed in QGIS

I’ve been heavily involved in the open source QGIS mapping project for a number of years now. During this time I’ve kept a close watch on the various mailing lists, issue trackers, stackexchange, tweets and other various means users have to provide feedback to the project. Recently, I’ve started to come to the conclusion that there’s a lot of fundamental confusion about how the project works and how users can get changes made to the project. Read on for these insights, but keep in mind that these are just my thoughts and not reflective of the whole community’s views!..

Firstly – QGIS is a community driven project. Unlike some open source projects (and unlike the proprietary GIS offerings) there is no corporate backer or singular organisation directing the project. This means two things:

  1. The bad news: No-one will do your work for you. QGIS has been created through a mix of user-led contributions (ie, users who have a need to change something and dive in and do it themselves) and through commercially supported contributions (either organisations who offer commercial QGIS support pushing fixes because their customers are directly affected or because they’ve been contracted by someone to implement a particular change). There HAS been a number of volunteer contributions from developers who are just donating their time (for various reasons), but these contributions are very much the minority.
  2. The good news: YOU have the power to shape the project! (And whenever I say “you” – I’m referring directly to the person reading this, or the company you work for. Just pretend it’s in 24 point bold red blinking text.) Because QGIS is community driven (and not subject to the whims of any one particular enterprise) every user has the ability to implement changes and fixes in the program.

So how exactly can users get changes implemented in QGIS? Well, let’s take a look at all the possible different ways that changes get made and how effective each one is:

  1. YOU can make the changes yourself. This implies that you have the c++/Python skills required to make the changes, are able to find your way around the source code, and push through the initial hurdles of setting up a build environment and navigating git. This can be a significant time investment, but the ultimate result is that you can make whatever changes you want, and so long as your pull request is accepted you’ll get your changes directly into QGIS. You’ll find the QGIS team is very open to new contributors and will readily lend a hand if you need assistance navigating the source or for advise on the best way to make these changes. Just ask!
  2. YOU (or your employer) can pay (or “sponsor”) someone to make the changes on your behalf. Reinvesting some of those savings you’re making through using an open source program back into the program itself is a great idea, and everyone benefits. There’s numerous organisations who specialise in QGIS development (eg… my own consultancy, North Road). You can liaise with these organisations to get them to make the changes on your behalf. This is probably the most effective way of getting changes implemented. These organisations all have a history with QGIS development and this experience generally translates to much faster development then if you code it yourself. It’s also somewhat of a shortcut – if you hire a core QGIS developer to make your changes, then you can be confident that they are familiar with the coding style, policies, and long-term goals of the project and accordingly can get the changes accepted rapidly. The obvious down side of paying for changes is that, well, it costs money. Understandably, not everyone has the resources available to do this.
  3. Following on from option 2 – if you can’t directly sponsor changes yourself, you could help indirectly raise funds to pay for the changes. This is a great way to get changes implemented, because everyone has the power to do this. You could seek out similar organisations/users who have the same need and pool your resources, get involved with the local QGIS user group and raise funds together, organise a crowd-funding campaign, etc.
  4. Ask a developer to make the changes for you. This is not terribly effective – you’re basically asking someone to work for free, and take time away from their family/job/hobbies/social life to do work for you. That said, it does sometimes happen, and here’s a few reasons I can think of why:
    • You’ve build up enough “karma” within the project through other contributions. If someone has been heavily involved in the non-development side of the project (eg translations, documentation, helping users out on mailing lists/stackexchange, organising hackfests or user groups, etc) then developers are much more likely to want to help them out in turn.
    • You’ve got a fantastic idea which has just never occurred to anyone before. By bringing it to the attention of a developer you might trigger the “wow, I could really benefit from that too!” impulse which is hard-wired into some of us!
    • It’s a particularly interesting or challenging problem, and sometimes developers just like to extend themselves.
  5. (For bugs only) File a bug report, and hope it gets picked up in one of the pre-release bug fixing sprints. This is basically the same as option 2 – expect that in this case someone else (the QGIS steering committee) is paying for the development time. There’s no way of guaranteeing that your bug will get fixed during this time though, so it’s not a particularly reliable approach if the fix is critical for you.

Finally, there’s two more very ineffective approaches:

  1. File a bug report/feature request, and wait. This isn’t very effective, because what you’re doing is basically the same as 1-4 above, but just waiting for someone else to either do the work or sponsor the changes. This might happen in a week, or might take 10 years.
  2. Complain about something and hope for the best. This is… not very effective. No-one is particularly motivated to help out someone who is being a jerk.

That’s it. Those are the ONLY ways changes get made in QGIS. There’s no other magical short-cuts you can take. Some of these approaches are much more effective than others, and some require skills or resources which may not be available. If you want to see something change in QGIS, you need to take a look at these options and decide for yourself which best meets your needs. But please, just don’t choose option 7!

Update: a follow up to this article was published

Tagged , ,

Exploring variables in QGIS pt 3: layer level variables

In part 3 of my exploration of variables in QGIS 2.12, I’m going to dig into how variables are scoped in QGIS and what layer level variables are available (you can read parts 1 and 2 for a general introduction to variables).

Some background

Before we get to the good stuff, a bit of background in how variables work behind-the-scenes is important. Whenever an expression is evaluated in QGIS the context of the expression is considered. The context is built up from a set of scopes, which are all stacked on top of each other in order from least-specific to most-specific. It’s easier to explain with an example. Let’s take an expression used to set the source of a picture in a map composer. When this expression is evaluated, the context will consist of:

  1. The global scope, consisting of variables set in the QGIS options dialog, and other installation-wide properties
  2. The project scope, which includes variables set in the Project Properties dialog and the auto-generated project variables like @project_path, @project_title (you can read more about this in part 2)
  3. composer scope, with any variables set for the current composer, plus variables for @layout_pagewidth, @layout_pageheight, @layout_numpages, etc.
  4. composer item scope for the picture, with item-specific variables including @item_id

The more specific scopes will override any existing clashing variables from less specific scopes. So a global @my_var variable will be overridden by an @my_var variable set for the composer:

overridden

Another example. Let’s consider now an expression set for a data-defined label size. When this expression is evaluated the context will depend on where the map is being rendered. If it’s in the main map canvas then the context will be:

  1. The global scope
  2. The project scope
  3. map settings scope, with variables relating to how the map is being rendered. Eg @map_rotation, @map_scale, etc
  4. layer scope. More on this later, but the layer scope includes layer-level variables plus preset variables for @layer_name and @layer_id

If instead the map is being rendered inside a map item in a map composer, the context will be:

  1. The global scope
  2. The project scope
  3. The composer scope
  4. An atlas scope, if atlas is enabled. This contains variables like @atlas_pagename, @atlas_feature, @atlas_totalfeatures.
  5. composer item scope for the map item
  6. map settings scope (with scale and rotation determined by the map item’s settings)
  7. The layer scope

Using layer level variables

Ok, enough with the details. The reason I’ve explained all this is to help explain when layer level variables come into play. Basically, they’ll be available whenever an expression is evaluated inside of a particular layer. This includes data defined symbology and labeling, field calculator, and diagrams. You can’t use a layer-level variable inside a composer label, because there’s no layer scope used when evaluating this. Make sense? Great! To set a layer level variable, you use the Variables section in the Layer Properties dialog:

Setting a layer variablee

Setting a layer variable

Any layer level variables you set will be saved inside your current project, i.e. layer variables are per-layer and per-project. You can also see in the above screenshot that as well as the layer level variables QGIS also lists the existing variables from the Project and Global scopes. This helps show exactly what variables are accessible by the layer and whether they’ve been overridden by any scopes. You can also see that there’s two automatic variables, @layer_id and @layer_name, which contain the unique layer ID and user-set layer name too.

Potential use cases for layer-level variables

In the screenshot above I’ve set two variables, @class1_threshold and @class2_threshold. I’m going to use these to sync up some manual class breaks between rule based symbology and rule based labeling. Here’s how I’ve set up the rule-based symbols for the layer:

Rule based symbology using layer level variables

Rule based symbology using layer level variables

In a similar way, I’ve also created matching rule-based labeling (another new feature in QGIS 2.12):

Matching rule-based labels

Matching rule-based labels

Here’s what my map looks like now, with label and symbol colors matched:

*Map for illustrative purposes only... not for cartographic/visual design excellence!

*Map for illustrative purposes only… not for cartographic/visual design excellence!

If I’d hard-coded the manual class breaks, it would be a pain to keep the labeling and symbology in sync. I’d have to make sure that the breaks are updated everywhere I’ve used them in both the symbology and labeling settings. Aside from being boring, tedious work, this would also prevent immediate before/after comparisons. Using variables instead means that I can update the break value in a single place (the variables panel) and have all my labeling and symbols immediately reflect this change when I hit apply!

Another recent use case I had was teaming layer-level variables along with Time Manager. I wanted my points to falloff in both transparency and size with age, and this involved data defined symbol settings scattered all throughout my layer symbology. By storing the decay fall-off rate in a variable, I could again tweak this falloff by changing the value in a single place and immediately see the result. It also helps with readability of the data defined expressions. Instead of trying to decipher a random, hard-coded value, it’s instead immediately obvious that this value relates to a decay fall-off rate. Much nicer!

I’m sure there’s going to be hundreds of novel uses of layer-level variables which I never planned for when adding this feature. I’d love to hear about them though – leave a comment if you’d like to share your ideas!

One last thing – the new “layer_property” function

This isn’t strictly related to variables, but another new feature which was introduced in QGIS 2.12 was a new “layer_property” expression function. This function allows you to retrieve any one of a bunch of properties relating to a specific map layer, including the layer CRS, metadata, source path, etc.

This function can be used anywhere in QGIS. For instance, it allows you to insert dynamic metadata about layers into a print composer layout. In the screenshot below I’ve used expressions like layer_property(‘patron’,’crs’) and layer_property(‘patron’,’source’) to insert the CRS and source path of the “patron” layer into the label. If either the CRS or the file path ever changes, this label will be automatically updated to reflect the new values.

Inserting dynamic layer properties into a composer label

Inserting dynamic layer properties into a composer label

 

So there you go – layer level variables and the layer_property function – here in QGIS 2.12 and making your workflow in QGIS easier. In the final part of this series, we’ll explore the magical @value variable. Trust me, I’ve saved the best for last!

Tagged , , , ,

Exploring variables in QGIS pt 2: project management

Following on from part 1 in which I introduced how variables can be used in map composers, I’d like to now explore how using variables can make it easier to manage your QGIS projects. As a quick refresher, variables are a new feature in QGIS 2.12 which allow you to create preset values for use anywhere you can use an expression in QGIS.

Let’s imagine a typical map project. You load up QGIS, throw a bunch of layers on your map, and then get stuck into styling and labelling them ‘just right’. Over time the project gets more and more complex, with a stack of layers all styled using different rendering and labelling rules. You keep tweaking settings until you’re almost happy with the result, but eventually realise that you made the wrong choice of font for the labelling and now need to go through all your layers and labelling rules and update each in turn to the new typeface. Ouch.

Variables to the rescue! As you may recall from part 1, you can reuse variables anywhere in QGIS where you can enter an expression. This includes using them for data defined overrides in symbology and labelling. So, lets imagine that way back at the beginning of our project we created a project level variable called @main_label_font:

Creating a variable for label font

Creating a variable for label font

Now, we can re-use that variable in a data defined override for the label font setting. In fact, QGIS makes this even easier for you by showing a “variables” sub-menu allowing easy access to all the currently defined variables accessible to the layer:

Binding the label font to the @main_label_font variable

Binding the label font to the @main_label_font variable

 

When we hit Apply all our labels will be updated to use the font face defined by the @main_label_font variable, so in this case ‘Courier New’:

courier_new

In a similar way we can bind all the other layer’s label fonts to the same variable, so @main_label_font will be reused by all the layers in the project. Then, when we later realise that Courier New was a horrible choice for labelling the map, it’s just a matter of opening up the Project Properties dialog and updating the value of the @main_label_font variable:

delicious

And now when we hit Apply the font for all our labelled layers will be updated all at once:

new_labels

It’s not only a huge time saver, it also makes changes like this easier because you can try out different font faces by updating the variable and hitting apply and seeing the effect that the changes have all at once. Updating multiple layers manually tends to have the consequence that you forget what the map looked like before you started making the change, making direct comparisons harder.

Of course, you could have multiple variables for other fonts used by your project too, eg @secondary_label_font and @highlighted_feature_font. Plus, this approach isn’t limited to just setting the label font. You could utilise project level variables for consolidating font sizes, symbol line thickness, marker rotation, in fact, ANYTHING that has one of those handy little data defined override buttons next to it:

See all those nice little yellow buttons? All those controls can be bound to variables...

See all those nice little yellow buttons? All those controls can be bound to variables…

One last thing before I wrap up part 2 of this series. The same underlying changes which introduced variables to QGIS also allows us to begin introducing a whole stack of new, useful functions to the expression engine. One of these which also helps with project management is the new project_color function. Just like how we can use project level variables throughout a project, project_color lets you reuse a color throughout your project. First, you need to create a named colour in the Default Styles group under the Project Properties dialog:

Define a colour in the project's colour scheme...

Define a colour in the project’s colour scheme…

Then, you can set a data defined override for a symbol or label colour to the expression “project_color(‘red alert!’)“:

bind_color

When you go back and change the corresponding colour in the Project Properties dialog, every symbol bound to this colour will also be updated!

blue_alert

So, there you have it. With a little bit of forward planning and by taking advantage of the power of expression variables in QGIS 2.12 you can help make your mapping projects much easier to manage and update!

That’s all for now, but we’re still only just getting started with variables. Part 3, coming soon!.. (Update: Part 3 is available now)

 

Tagged , , ,