Improving NCSD Data Quality - call for input

The Operator Digital Initiative (OpDI), Confederation for Passenger Transport (CPT) and Traveline have kicked off a project this week to improve NCSD, the National Coach Service Database. At a minimum, this will include assessing data completeness and reaching out to coach operators to incentivise good data publishing practice.

All comments and suggestions on NCSD from this forum would be very welcome.

I am happy to add ideas for this discussion.

I agree that NCSD needs to improve and needs to be much bigger than it is now as there are so many missing services.

Something that I think needs to be looked at though, as daft as it sounds, is what constitutes a coach service. Where is the line for what should/shouldn’t be included?

Purely as examples, there are a number of local services which are operated by coaches (T1C in Wales, Red Arrow Nottingham, X6 Leicester, Greenline 757), as well as a number of coach services which have a local bus element (Berrys Superfast is long distance express but carries local passengers at local stops).
Do event services count? This was something I flagged up on KPMG’s Discord a while back and there was little budge on it, as National Express wanted events in the system.
Is Northern Ireland included?
International? etc.

I’m not trying to be difficult, I just think that if the database is going to be improved, there need to be clearer criteria.

Then how can operators be encouraged to publish if they don’t currently create any kind of open data and don’t seem to respond to anyone who enquires about them providing data (e.g. Thandi Express, Thandi Red, Hannon, Flibco)? I think the only way it was done previously was through a lot of manual entry, which no one seems to want to do.

There are many missing services in the ATCO-CIF version, culminating in a recent message to the BODS helpdesk:

I’d like to raise an issue with the BODS ATCO CIF NCSD. The latest file at https://coach.bus-data.dft.gov.uk/ATCO-CIF.zip only contains two lines, namely:

ATCO-CIF0510Bus Open Data BODs ATCO-CIF 20250718094227
QVNCOACH COACH

Could this please be looked at as the websites of National PTI, Transport for London, Transport for West Midlands and Merseytravel are currently lacking the entire NCSD due to the issue above.
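
A check like the one below could catch an effectively empty file before it reaches downstream websites. It is only a minimal sketch: it assumes each ATCO-CIF record starts with a two-character type code and that journeys are introduced by "QS" (journey header) records. The function names `summarise_atco_cif` and `has_journeys` are made up for illustration.

```python
from collections import Counter


def summarise_atco_cif(lines):
    """Count records by their two-character type code, skipping the file header."""
    counts = Counter()
    for line in lines:
        line = line.rstrip("\r\n")
        if not line or line.startswith("ATCO-CIF"):
            continue  # blank line or the file header line
        counts[line[:2]] += 1
    return counts


def has_journeys(lines):
    # Assumption: every journey begins with a "QS" journey header record.
    return summarise_atco_cif(lines)["QS"] > 0


# The two lines quoted in the helpdesk message above.
sample = [
    "ATCO-CIF0510Bus Open Data BODs ATCO-CIF 20250718094227",
    "QVNCOACH COACH",
]

print(dict(summarise_atco_cif(sample)))
print(has_journeys(sample))
```

Run against the two quoted lines, this reports a single "QV" record and no journeys, which is the condition a publisher or consumer would want to alert on.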

Not great. Additionally, the lack of any built-in wording that “these services must be pre-booked”, along with the website/helpline contact details for doing so, is an omission and a downgrade on the previous Basemap version.

I also agree that registered local services must be de-duplicated from the BODS NCSD as these must, statutorily, be in the BODS local bus service data.

The timeliness of responses is shocking. @iMarkeh raised an issue on the Discord forum that took 16 days to get a response, and even then the issues remained unfixed.

I also need to mention the poor or very incomplete route header information, especially for Flixbus services. “Belgravia - Belgravia” is common for London-bound services. Via where: Newcastle, Cardiff, Plymouth, Liverpool?

Flixbus data is an absolute mess. I don’t think it’s all due to KPMG though, as a number of issues are also prominent in the Flix GTFS feed.

  • All trips are outbound only; there are no ‘inbound’ trips.
  • Some routes have poor descriptions which don’t relate to where they go (e.g. UK003 claims to run to Halifax but Flix don’t stop in Halifax).
  • Lack of pick up/set down stop restrictions. I know some stops are difficult to make this work for, but National Express manages it by doing each trip separately to effectively ban local journeys being offered. Alternatively, any stop which is always pick up only or drop off only should be marked as such (e.g. Hammersmith is always drop off only; Truro Bus Station is drop off only towards Falmouth Uni but pick up only towards London, etc.).
  • Loads of stops are on the wrong side of the road or in the wrong places because in some places Flixbus isn’t differentiating by side of the road. For example, even in the Flix GTFS data, Middlesbrough’s night stop is not reflected anywhere despite being 150+ metres away from the daytime stop. At University of Nottingham, some trips towards London stop on one side of the road while other trips towards London use the other side.
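
Two of the points above (direction and pick up/set down restrictions) can be checked mechanically in any GTFS feed. The sketch below uses the standard GTFS fields `direction_id` in trips.txt and `pickup_type` / `drop_off_type` in stop_times.txt, where "1" means not available. The sample rows, stop IDs and function names are invented for illustration and are not taken from the Flix feed.

```python
import csv
import io


def directions_used(trips_csv):
    """Return the set of direction_id values present in a trips.txt payload."""
    rows = csv.DictReader(io.StringIO(trips_csv))
    return {row.get("direction_id", "") for row in rows}


def unrestricted_stops(stop_times_csv):
    """Return stop_ids where no call has a pick-up or drop-off restriction."""
    seen, restricted = set(), set()
    for row in csv.DictReader(io.StringIO(stop_times_csv)):
        seen.add(row["stop_id"])
        # GTFS: empty or "0" is a regular call, "1" means not available.
        if row.get("pickup_type") == "1" or row.get("drop_off_type") == "1":
            restricted.add(row["stop_id"])
    return seen - restricted


# Hypothetical sample data, for illustration only.
trips = (
    "route_id,service_id,trip_id,direction_id\n"
    "R1,wk,t1,0\n"
    "R1,wk,t2,0\n"
)
stop_times = (
    "trip_id,arrival_time,departure_time,stop_id,stop_sequence,pickup_type,drop_off_type\n"
    "t1,08:00:00,08:00:00,A,1,0,1\n"
    "t1,10:00:00,10:00:00,B,2,0,0\n"
    "t1,11:00:00,11:00:00,C,3,1,0\n"
)

print(directions_used(trips))
print(unrestricted_stops(stop_times))
```

A feed where `directions_used` only ever returns one value, or where a known drop-off-only stop appears in `unrestricted_stops`, would show the kind of issue described above.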

I could go on, but you get the idea. Sadly I’ve found no one at Flixbus yet who is willing to listen to get these data issues sorted in the GTFS feed (which in the end would make great improvements in the accuracy of NCSD).

In relation to your de-duplication point, to throw some curveballs in there, National Express has a number of routes which are part registered, part not. To completely deduplicate would mean half of a service is in NCSD, half in BODS.

Hi

Thanks for the details, yes, “something needs to be done”.

As a general rule on registered local services vs coach services, in our data (mostly SW England-based) we try to place services with registered elements (e.g. SW Falcon, Berrys Coaches Superfast) in bus data and reserve coach data for “pure” coach services, but accept that it is difficult to be entirely consistent where National Express routes are concerned, as end users would clearly expect these to be defined as coaches.

Happy to introduce you to some contacts at Flixbus if you have had no luck?

This is all really helpful, thanks for all the advice so far. OpDI, CPT and Traveline will pick these FlixBus issues up via CPT’s contact at Flix, who is already engaged. That’s not to say that you shouldn’t go ahead and do your own thing too!

Some here may remember that the NCSD used to be collated direct from coach operators but also from each regional data supplier to Traveline, which used to send ‘express bus’ services to DfT to form part of the NCSD.
Some further thoughts:

  • It wouldn’t be impossible to identify the express bus services and create an extra dataset on either BODS or TNDS for this type of service.
  • Most journey planners that we’ve come across recognise duplicated data and only show one version to the customer but…
  • Some data consumers may only take coach and not BODS bus data.
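
One way to reconcile the last two points is to flag duplicates rather than delete them, so that coach-only consumers still receive every service while journey planners taking both feeds can suppress the repeat. The following is a rough sketch only: the `Service` record and the matching key (operator code plus line name) are assumptions made for illustration, not how NCSD or BODS identify services, and part-registered routes would need finer-grained matching than this.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Service:
    operator: str  # hypothetical operator code
    line: str      # hypothetical line name
    source: str    # "NCSD" or "BODS"


def flag_duplicates(services):
    """Map each service to True when its operator and line appear in more than one source."""
    sources_by_key = {}
    for s in services:
        sources_by_key.setdefault((s.operator, s.line), set()).add(s.source)
    return {s: len(sources_by_key[(s.operator, s.line)]) > 1 for s in services}


# Invented sample records, for illustration only.
services = [
    Service("OPA", "X1", "NCSD"),
    Service("OPA", "X1", "BODS"),
    Service("OPB", "100", "NCSD"),
]

for service, duplicated in flag_duplicates(services).items():
    print(service, duplicated)
```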