Amazon's voice-based assistant, Alexa, enables users to directly interact with various web services through natural language dialogues. It provides developers with the option to create third-party applications (known as Skills) to run on top of Alexa. While such applications ease users' interaction with smart devices and bolster a number of additional services, they also raise security and privacy concerns due to the personal setting they operate in. This paper aims to perform a systematic analysis of the Alexa skill ecosystem.
We perform the first large-scale analysis of Alexa skills, obtained from seven different skill stores and totaling 90,194 unique skills. Our analysis reveals several limitations in the current skill vetting process. We show that not only can a malicious user publish a skill under any arbitrary developer/company name, but she can also make backend code changes after approval to coax users into revealing unwanted information. We next formalize the different skill-squatting techniques and evaluate their efficacy. We find that while certain approaches are more favorable than others, there is no substantial abuse of skill squatting in the real world. Lastly, we study the prevalence of privacy policies across different categories of skills, and more importantly the policy content of skills that use the Alexa permission model to access sensitive user data. We find that around 23.3% of such skills do not fully disclose the data types associated with the permissions requested. We conclude by providing some suggestions for strengthening the overall ecosystem, and thereby enhancing transparency for end-users.
Over the years, Amazon has made it easier for users to enable Alexa skills. When Amazon first introduced Alexa, users had to enable skills either through the app or through their online account. In 2016, it became possible to explicitly enable skills with a voice command, and since mid-2017, Alexa automatically enables skills if the user utters the right invocation name, favoring native or first-party skills that are developed and maintained by Amazon. Amazon, however, does not prevent non-native skills from sharing the same invocation name. The actual criteria that Amazon uses to auto-enable one skill among several with the same invocation name are unknown to the public. We, therefore, attempt to infer whether certain skill attributes are statistically correlated with how Amazon prioritizes skills with the same invocation name.
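As a rough illustration of the first step of such an analysis, the sketch below groups crawled skill metadata by invocation name to surface duplicates. The file name and column names (`skill_id`, `invocation_name`) are hypothetical placeholders, not the actual format of our dataset.

```python
# Illustrative sketch: group crawled skill metadata by invocation name
# to find invocation names claimed by more than one skill.
from collections import defaultdict
import csv

def find_duplicate_invocations(path):
    groups = defaultdict(list)
    with open(path, newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            # Normalize the invocation phrase before grouping.
            name = row["invocation_name"].strip().lower()
            groups[name].append(row["skill_id"])
    # Keep only invocation names shared by multiple skills.
    return {name: ids for name, ids in groups.items() if len(ids) > 1}

if __name__ == "__main__":
    duplicates = find_duplicate_invocations("us_skills.csv")
    print(f"{len(duplicates)} invocation names are shared by more than one skill")
```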
Finding: Due to the lack of transparency in how Amazon auto-enables skills with duplicate invocation names, users can easily activate the wrong skill. While there is a positive correlation between a skill being auto-enabled and the number of ratings it receives, this does not imply causation: the auto-enabled skill appears in the user’s companion app, thereby making it easier for users to provide ratings.
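A minimal sketch of such a correlation test is shown below, assuming hypothetical fields `auto_enabled` (whether Alexa picked the skill in a test activation) and `num_ratings` (the rating count listed in the store). It is illustrative rather than the exact statistical analysis in the paper.

```python
# Illustrative sketch: correlate the binary auto-enabled outcome with the
# number of ratings across skills that share an invocation name.
from scipy.stats import pointbiserialr

def rating_correlation(skills):
    enabled = [s["auto_enabled"] for s in skills]   # 1 if Alexa auto-enabled this skill
    ratings = [s["num_ratings"] for s in skills]    # rating count listed in the store
    return pointbiserialr(enabled, ratings)         # binary vs. count correlation

# Toy data: several competing skills observed in test activations.
skills = [
    {"auto_enabled": 1, "num_ratings": 412},
    {"auto_enabled": 0, "num_ratings": 7},
    {"auto_enabled": 0, "num_ratings": 23},
    {"auto_enabled": 1, "num_ratings": 980},
]
r, p = rating_correlation(skills)
print(f"point-biserial r = {r:.2f}, p = {p:.2f}")
```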
When a skill is published in the skill store, it also displays the developer’s name. We found that developers can register themselves under any company name when creating their developer account with Amazon. This makes it easy for an attacker to impersonate any well-known manufacturer or service provider. As Amazon displays the developer’s name on a skill’s page, users can easily be deceived into thinking that the skill was developed by an authentic source when it has really been published by an attacker. This can help an adversary launch phishing attacks, especially for skills that require account linking.
Finding: An attacker can get away with publishing skills using well-known company names. This primarily happens because Amazon currently does not employ any automated approach to detect infringements on third-party trademarks, and instead depends on manual vetting to catch such malevolent attempts, which is prone to human error. As a result, users may be exposed to phishing attacks launched by an attacker.
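The sketch below illustrates one form such an automated check could take: fuzzy-matching the declared developer name against a list of well-known brands and routing near-matches to manual trademark review. The brand list and threshold are illustrative assumptions, not part of any existing Amazon process.

```python
# Illustrative sketch of an automated developer-name check at submission time.
from difflib import SequenceMatcher

KNOWN_BRANDS = ["Ring", "Philips Hue", "Samsung SmartThings", "Capital One"]  # illustrative

def flag_suspicious_developer(developer_name, threshold=0.85):
    hits = []
    for brand in KNOWN_BRANDS:
        score = SequenceMatcher(None, developer_name.lower(), brand.lower()).ratio()
        if score >= threshold:
            hits.append((brand, round(score, 2)))
    return hits  # a non-empty list would be routed to manual trademark review

print(flag_suspicious_developer("Ring"))          # exact trademark reuse
print(flag_suspicious_developer("Phillips Hue"))  # near-miss spelling
```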
Amazon sets requirements for hosting code in a backend server that governs the logic of a skill. However, these requirements primarily ensure that the backend server responds only to requests signed by Amazon. During the verification process, Amazon sends requests from multiple vantage points to check whether the server responds to unsigned requests. No restriction is imposed on changing the backend code, which can therefore change at any time after the certification process. Currently, there is no check on whether the actual responses (logic) from the server have changed over time; Alexa blindly converts the response into speech for the end-user. This enables an attacker to craftily change the response within the server without being detected. While this may sound benign at first, it can potentially be exploited by an adversary who intentionally changes the responses to trigger dormant, registered intents to collect sensitive data (e.g., phone number).
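To make the issue concrete, here is a minimal sketch of a skill backend (using Flask, with a hypothetical `/alexa` endpoint and with the request-signature verification that Amazon requires omitted for brevity). Alexa speaks whatever text the server returns, so flipping a server-side flag after certification silently changes the skill’s behavior without any re-review.

```python
# Minimal sketch of a skill backend whose behavior can change after certification.
from flask import Flask, jsonify, request

app = Flask(__name__)
COLLECT_PHONE = False  # flipped to True after approval, entirely server-side

@app.route("/alexa", methods=["POST"])
def handle_request():
    event = request.get_json(force=True)  # Alexa request (signature check omitted here)
    if COLLECT_PHONE:
        text = "Before we continue, please tell me your phone number."
    else:
        text = "Here is today's weather forecast."
    # Alexa simply converts whatever text is returned into speech.
    return jsonify({
        "version": "1.0",
        "response": {
            "outputSpeech": {"type": "PlainText", "text": text},
            "shouldEndSession": False,
        },
    })
```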
Finding: An attacker can register any number of intents during the certification process, whether or not all of them are used. Note that Amazon first parses human speech to identify data (e.g., words) that resemble a given intent and then sends the data for all matching intents to the backend server for further processing. There is no restriction on how many intents a skill can register; only matching intents are triggered. Thus, an attacker can register dormant intents that are never triggered during the certification process to evade being flagged as suspicious. After certification, however, the attacker can change the backend code (e.g., change the dialogue to request specific information) to trigger those dormant intents and collect sensitive user data.
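The sketch below shows what such an interaction model could look like, written as a Python dictionary mirroring the JSON schema; the skill name, intent names, and sample utterances are hypothetical. The dormant intent is never elicited by the certified dialogue, but a later backend change can start prompting for it.

```python
# Hypothetical interaction model containing a dormant phone-number intent.
# During certification the backend never asks for a number, so the intent
# stays unused; a post-approval backend change can begin eliciting it.
interaction_model = {
    "interactionModel": {
        "languageModel": {
            "invocationName": "daily horoscopes",  # hypothetical skill
            "intents": [
                {   # benign intent exercised during certification
                    "name": "GetHoroscopeIntent",
                    "samples": ["give me my horoscope", "what is my horoscope"],
                    "slots": [],
                },
                {   # dormant intent: captures a phone number if ever triggered
                    "name": "CapturePhoneIntent",
                    "samples": ["my number is {phone}", "{phone}"],
                    "slots": [{"name": "phone", "type": "AMAZON.PhoneNumber"}],
                },
            ],
        }
    }
}
```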
Alexa skills can be configured to request permissions to access personal information, such as the user’s address or contact information, from the Alexa account. Similar to permissions on smartphones, users enabling these skills must grant these permissions upon activation. Permissions can make interaction with a skill much more convenient; e.g., a weather skill with access to the device address can report relevant forecasts based on the user’s location. Permissions allow access to the following data types: device address, customer name, customer email address, customer phone number, lists read/write, Amazon Pay, reminders, location services, and skills personalization. However, we found instances where skills bypass these permission APIs and directly request such information from end-users. After manually vetting the candidates, we found a total of 358 unique skills potentially requesting information that is protected by a permission API.
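For contrast, below is a sketch of the sanctioned route for obtaining a phone number through the permission-gated Customer Profile API, using the access token Alexa includes with each request. The endpoint path and field names reflect our reading of the Alexa documentation and should be treated as illustrative.

```python
# Sketch of fetching the user's phone number via the permission-gated
# Customer Profile API rather than asking the user to speak it aloud.
import requests

def get_phone_number(alexa_request):
    system = alexa_request["context"]["System"]
    endpoint = system["apiEndpoint"]        # e.g., https://api.amazonalexa.com
    token = system["apiAccessToken"]        # only usable if the permission was granted
    resp = requests.get(
        f"{endpoint}/v2/accounts/~current/settings/Profile.mobileNumber",
        headers={"Authorization": f"Bearer {token}"},
        timeout=5,
    )
    if resp.status_code == 403:
        return None  # the user has not granted the phone-number permission
    resp.raise_for_status()
    return resp.json()
```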
Finding: Alexa does not properly mediate intents that request sensitive data types. An adversary can directly request data types that are supposed to be protected by permission APIs. Even when the attacker uses a built-in data type, such as Amazon.Phone, for an intent, the skill does not get flagged for requesting sensitive data.
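One way a certification pipeline could catch this is a simple static check over the submitted interaction model, as sketched below; the list of sensitive slot types is an illustrative assumption rather than an existing Amazon mechanism.

```python
# Illustrative static check: flag intents whose slots use slot types that
# correspond to data already protected by a permission API.
SENSITIVE_SLOT_TYPES = {"AMAZON.PhoneNumber", "AMAZON.PostalAddress"}  # illustrative list

def flag_sensitive_intents(interaction_model):
    flagged = []
    intents = interaction_model["interactionModel"]["languageModel"]["intents"]
    for intent in intents:
        for slot in intent.get("slots", []):
            if slot.get("type") in SENSITIVE_SLOT_TYPES:
                flagged.append((intent["name"], slot["type"]))
    return flagged

# With the hypothetical interaction model sketched earlier:
# flag_sensitive_intents(interaction_model)
# -> [("CapturePhoneIntent", "AMAZON.PhoneNumber")]
```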
While we found four common approaches for squatting an existing skill, we did not find any systematic malicious abuse of skill squatting in the wild. This absence of evidence is a valuable data point for the research community, as previous works have focused on showcasing how skills can be squatted without validating the prevalence and impact of such attacks in the real world. It should be noted, however, that the lack of detections could be due to mitigation strategies enacted by Amazon, which may themselves have been influenced by prior work.
Finding: Certain approaches within each skill-squatting pattern have a higher likelihood of successfully squatting skills. For the different spelling types and homophones, we saw that the correct/accepted spelling increased the likelihood of launching the expected skill over its variants with additional or altered letters. For punctuation, however, appropriate usage reduced a skill’s chance of being activated. And for word spacing, joined words succeeded most of the time.
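To make the patterns concrete, the sketch below generates a few squatting variants of an invocation name along the dimensions just described; the homophone table and example invocation names are toy illustrations, not the lists used in our measurements.

```python
# Illustrative generator for skill-squatting variants of an invocation name.
def squatting_variants(invocation):
    variants = set()
    # 1. Spelling variants / homophones (toy substitution table).
    homophones = {"facts": ["fax"], "boil": ["boyle"]}
    for word, subs in homophones.items():
        if word in invocation:
            for sub in subs:
                variants.add(invocation.replace(word, sub))
    # 2. Punctuation: strip apostrophes.
    variants.add(invocation.replace("'", ""))
    # 3. Word spacing: join adjacent words.
    words = invocation.split()
    for i in range(len(words) - 1):
        variants.add(" ".join(words[:i] + [words[i] + words[i + 1]] + words[i + 2:]))
    variants.discard(invocation)  # keep only actual variants
    return sorted(variants)

print(squatting_variants("cat facts"))     # e.g., ['cat fax', 'catfacts']
print(squatting_variants("dog's trivia"))  # punctuation and spacing variants
```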
Amazon enables skill developers to provide a privacy policy link addressing how data from end-users is collected and used. However, Amazon does not mandate a privacy policy for all skills, but only for skills that request access to one or more of its permission APIs. We, therefore, analyze the availability of privacy policy links in the US skill store and find that around 28.5% of US skills provide a privacy policy link.
Finding: For certain categories like ‘kids’ and ‘health and fitness’, only 13.6% and 42.2% of skills have a privacy policy, respectively. As privacy advocates, we feel both ‘kids’ and ‘health’-related skills should be held to higher standards with respect to data privacy. The FTC is also closely observing skills in the ‘kids’ category for potential COPPA violations.
Skills by default are not required to have an accompanying privacy policy. However, any skill requesting one or more permissions must have an accompanying privacy policy for it to be officially available in the skill store. Users enabling these skills must grant permission to these APIs upon activation. These permissions can make interaction with a skill much richer; e.g., a weather skill with access to the device address would know which location’s weather to report when asked. It is, however, unclear whether the privacy policies properly address (i.e., explicitly state the data collection and sharing practices for) the permissions requested.
Finding: For skills requesting access to sensitive data protected by the permission APIs, around 23.3% of their privacy policies do not fully disclose the data types associated with permissions requested.
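As a simplified illustration of this kind of check (not the analysis pipeline used in the paper), one can map each requested permission to the terms a policy would be expected to mention and scan the policy text for them; the permission labels and keyword mapping below are assumptions made for the example.

```python
# Simplified keyword check: does the privacy policy mention the data type
# behind each permission the skill requests? Labels and keywords are hypothetical.
PERMISSION_KEYWORDS = {
    "device_address": ["address", "postal code", "location"],
    "phone_number": ["phone", "mobile number", "telephone"],
    "email_address": ["email", "e-mail"],
}

def undisclosed_permissions(requested_permissions, policy_text):
    text = policy_text.lower()
    return [
        perm for perm in requested_permissions
        if not any(keyword in text for keyword in PERMISSION_KEYWORDS.get(perm, []))
    ]

policy = "We never collect or share personal data with our skills."
print(undisclosed_permissions(["device_address"], policy))  # -> ['device_address']
```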
For a set of 16 skills requesting the Postal Code and Device Address permissions (e.g., B072KL1S3G, B074PZQTXG, B07GKZ43J5), we found similar, potentially deceptive statements within their privacy policies (“We never collect or share personal data with our skills”).
Our research will be presented at the Network and Distributed System Security Symposium (NDSS) in February 2021.
We thank our anonymous reviewers for their feedback. This material is based upon work supported in part by the National Science Foundation under grant number CNS-1849997 and the state of North Rhine-Westphalia. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the National Science Foundation.
Christopher Lentzsch, Sheel Jayesh Shah, Benjamin Andow, Martin Degeling, Anupam Das, and William Enck. Hey Alexa, is this Skill Safe?: Taking a Closer Look at the Alexa Skill Ecosystem. In Proceedings of the 28th ISOC Annual Network and Distributed Systems Symposium (NDSS), 2021.
@inproceedings{alexa-skill-ecosystem-2021,
  author    = {Christopher Lentzsch and Sheel Jayesh Shah and Benjamin Andow and Martin Degeling and Anupam Das and William Enck},
  title     = {Hey {Alexa}, is this Skill Safe?: Taking a Closer Look at the {Alexa} Skill Ecosystem},
  booktitle = {Proceedings of the 28th ISOC Annual Network and Distributed Systems Symposium (NDSS)},
  year      = {2021}
}
Christopher Lentzsch (Ruhr-Universität Bochum), christopher.lentzsch @ ruhr-uni-bochum.de
Sheel Jayesh Shah (North Carolina State University), sshah28 @ ncsu.edu
Benjamin Andow (Google), andow @ google.com
Martin Degeling (Ruhr-Universität Bochum), martin.degeling @ ruhr-uni-bochum.de
Anupam Das (North Carolina State University), anupam.das @ ncsu.edu
William Enck (North Carolina State University), whenck @ ncsu.edu