So, if I understand the data, it is as follows:
The first thing to note is that this is only one hour of data so there will be some variation, of course. The benefit of these S2 devices is that they run continuously and therefore slight under/over counts in any one time period will balance out to give a good indication of general trends if not absolute numbers. Minor absolute count differences will be amplified as percentages over a small sample size.
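As a rough illustration of that last point, here is a minimal sketch in Python. The counts are made-up numbers, not your data, and `percent_difference` is a hypothetical helper written for this example. It shows that the same absolute miss of two counts is a large percentage of a small total and a tiny percentage of a large one.

```python
def percent_difference(manual_count, device_count):
    """Signed difference of the device count relative to the manual count, in percent."""
    if manual_count == 0:
        raise ValueError("manual_count must be non-zero")
    return 100.0 * (device_count - manual_count) / manual_count


# Hypothetical counts for illustration only (not taken from the reported data).
one_hour = percent_difference(manual_count=10, device_count=8)
long_run = percent_difference(manual_count=1000, device_count=998)

print(f"Small sample: {one_hour:.1f}%")
print(f"Large sample: {long_run:.1f}%")
```

In both cases the device is off by two, but only the small sample makes it look like a serious discrepancy.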
For the individual categories:
Pedestrians: we know that counts for these, and for two-wheelers, can be less accurate for the very reason you pointed out. Pedestrians often walk side by side with companions, and as such it may be difficult for the camera to separate them out. We intentionally use a lower-resolution camera for privacy and cost, but the trade-off is that two people side by side may look like just one. Over a single hour this can affect your numbers greatly, for example if that hour included seven “groups” of pedestrians.
Scooters are generally interpreted as pedestrians at this time as they do not “look” like a bike, so if they are recognised at all, they will probably be counted as pedestrians.
The system will also reject any ‘object’ that does not (mainly) travel from one side to the other of the field of view. So any individuals who enter houses, or climb into cars, will not be counted.
Taking this into account could explain many of the differences between the numbers you reported. Of course, we may still miss some pedestrians, and this also goes for those where we see only head and shoulders over cars or hedges, etc., some of which may get recognised, and others not.
Two-wheelers: It would be unusual for a two-wheeler to be recognised as a car, but not impossible (depending on how long the system has to recognise something moving). This can sometimes be the case if there are strong shadows in the view during the day, as the contrast can create false ‘bulk’ in an image.
Again, we may not see and recognise every single bike, so there may be an undercount, and the same applies here (though it may not be your case) for bikes riding side by side.
Cars & Heavy Vehicles: The differentiation here is a little arbitrary. As you mention, the vans were small, so it may be a question of some being tagged as cars, others as vans. It is hard, even as a human observer, to decide whether something is a car, a van or a truck, so some variation could occur.
The total of these two categories being higher than the manual count is unusual, and as you say, there could have been some confusion over a bike/motorbike or two, but since the numbers are so small, any very minor issue will end up being a large percentage.
The real value of the data is in the trends and observed changes, rather than just absolute numbers. Whatever issues there might be with the data collection, having thousands of data points (every 15 minutes, every day, for a year) collected using a consistent tool gives a very good indication of the real activity levels on that street, even if the numbers for each individual mode are not 100% accurate.
If you are particularly concerned about the local conditions, the best thing would be to record a video with the same field of view (including trees etc.) as the device and we can try running it through our training system to see if we spot any systematic errors.