Well, there are certainly some basic steps that can be taken to help identify and prevent bot usage of the key identity management services that many public-facing APIs and applications expose. First, let's describe some of the main functional areas bots are likely to attack.
The Identity Attack Vector
Any public-facing service or API will expose several identity-related endpoints. If you think of the full identity life cycle, take the following as a basic list of expected services: account signup; progressive profiling; social signup; profile management; device registration/device pairing; forgotten password/username; sign in; MFA sign in; MFA enrolment; and probably account deletion/RTBF (right to be forgotten) if the service provider is being considerate.
Account Sign-up, then Clean-up
Without an account, you can't access a service. So it seems likely this would be the first entry point into the application that a bot would look to test and use. A few simple steps here could help. Firstly, leverage a CAPTCHA system. The lovely long acronym (standing for Completely Automated Public Turing Test to Tell Computers and Humans Apart) is a simple way of adding a little barrier to a flow to reduce automation. For the geeks, it's actually a reverse Turing test, but we won't go there just now. Google's reCAPTCHA is pretty simple to integrate, but numerous alternatives are available. A CAPTCHA step would certainly belong early on in the signup flow.
What else? Well, clearly some sort of input validation would be useful during signup. So, perhaps client-side libraries to perform syntax checking for things like email address, username and so on. If exposing APIs, a simple server-side validation engine would be needed here too, since a bot can bypass any client-side check by calling the endpoint directly.
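To make the server-side validation idea a little more concrete, here is a minimal sketch in Python. The field names (`email`, `username`), the regular expressions and the length limits are all assumptions chosen for illustration, not a standard or any particular product's policy; a real service would tune them and probably lean on an established validation library.

```python
import re

# Deliberately loose syntax checks (hypothetical policy, for illustration only).
EMAIL_RE = re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+")
USERNAME_RE = re.compile(r"[A-Za-z0-9_.-]{3,32}")


def validate_signup(payload: dict) -> list[str]:
    """Return a list of validation errors; an empty list means the payload passed."""
    errors = []
    email = payload.get("email")
    username = payload.get("username")

    # Reject missing, non-string, over-long or syntactically odd email addresses.
    if not isinstance(email, str) or len(email) > 254 or not EMAIL_RE.fullmatch(email):
        errors.append("email: invalid syntax")

    # Restrict usernames to a small, predictable character set.
    if not isinstance(username, str) or not USERNAME_RE.fullmatch(username):
        errors.append("username: must be 3-32 characters from A-Z, a-z, 0-9, '_', '.', '-'")

    return errors
```

The signup endpoint would call `validate_signup` on every request before creating an account, regardless of what checks the browser has already run. Syntax checks like these won't stop a determined bot on their own, but they cheaply discard a lot of malformed, automated traffic.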
So let's skip forward a little. At account sign-in time, there are several steps that should be considered. We know multi-factor authentication is omnipresent and also perhaps coming to the end of its useful life, with more modern, flexible and fine-grained approaches to authentication being needed.
Throttling, Analytics & Machine Learning
Another big risk of bot infestation is DDoS, so a basic stopper can be throttling. Applying limits to the number of times a device calls a particular endpoint seems a no-brainer. The throttling is generally tied to things like the servlet session ID or perhaps the IP address. Whilst none of these measures is insurmountable, they all add to the defence-in-depth approach.
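As a rough illustration of per-endpoint throttling, the sketch below keeps a sliding window of recent calls for each client and endpoint pair. The class name `SlidingWindowThrottle`, its parameters and the in-memory storage are assumptions made for the example; a production deployment would more likely use an API gateway or a shared store so that limits hold across multiple servers.

```python
import time
from collections import defaultdict, deque


class SlidingWindowThrottle:
    """Allow at most `limit` calls per client and endpoint within `window_seconds`."""

    def __init__(self, limit: int, window_seconds: float, clock=time.monotonic):
        self.limit = limit
        self.window = window_seconds
        self.clock = clock  # injectable so the behaviour can be tested without sleeping
        self._hits = defaultdict(deque)  # (client_key, endpoint) -> call timestamps

    def allow(self, client_key: str, endpoint: str) -> bool:
        now = self.clock()
        hits = self._hits[(client_key, endpoint)]

        # Drop timestamps that have aged out of the window.
        while hits and now - hits[0] >= self.window:
            hits.popleft()

        if len(hits) >= self.limit:
            return False  # over the limit: the caller should reject, e.g. with HTTP 429

        hits.append(now)
        return True
```

Here `client_key` could be a session ID, an IP address or a device identifier, whichever the service trusts most. For example, `SlidingWindowThrottle(limit=5, window_seconds=60)` would let each client attempt a sign-in five times a minute before `allow` starts returning `False`.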
Machine learning seems to be the flavour of the month when it comes to cyber security in general. Whilst it seems relatively early days for ML or AI best practices on this topic, as-a-service machine learning platforms (AWS's support for MXNet, for example) could easily be configured to consume the activity and log data collected across the identity lifecycle, and would seem a nice weapon in the bot-fighting arsenal.