How We Recreated Amazon Go in 36 Hours: A Full-Stack Developer's Perspective

In January 2018, Amazon opened its first Amazon Go store to the public in Seattle. The 1,800-square-foot convenience store has no registers or cashiers. Instead, it relies on a network of cameras, sensors, and artificial intelligence to automatically detect which items customers take from the shelves, allowing them to simply walk out without checking out.

The launch of Amazon Go sparked widespread fascination and speculation about the future of retail automation. Many wondered how exactly the "just walk out" technology worked behind the scenes and whether it could be replicated.

Intrigued by the technical challenge, my colleagues and I decided to find out by attempting to build our own version of Amazon Go in just 36 hours at a hackathon. Our goal was to create a working prototype that demonstrated the core functionality – a store where customers could take items off the shelf and leave without any manual checkout process.

Planning and System Design

Our team consisted of four engineers, each specializing in a different area:

  • John, an iOS developer
  • Ruslan, a backend developer with machine learning experience
  • Soheil, an embedded systems engineer
  • Me, an Android developer

To start, we broke down the problem and defined the key components that would be required to achieve the desired functionality:

  1. Customer identification: Ability to uniquely identify customers entering and exiting the store
  2. Inventory tracking: Real-time tracking of product inventory on the shelves
  3. Item detection: Detecting when a specific customer takes a specific product off the shelf
  4. Virtual cart management: Maintaining a virtual cart of items for each customer in the store
  5. Seamless entry/exit: Customers should be able to enter and leave with no manual action required (i.e. no turnstiles or doors)
  6. Instant receipts: Customers should be billed automatically upon exit for the items they took

With these requirements in mind, we drafted a high-level system architecture:

[Insert detailed architecture diagram showing all components and data flow]

The architecture consisted of four main components:

  1. Facial recognition system to identify customers based on pre-enrolled facial profiles
  2. Real-time database to maintain customer profiles, virtual shopping carts, and product inventory
  3. Smart shelf sensors to detect item removal and link it to a customer
  4. Mobile apps for store management and customer shopping experience

Let's dive into each of these components in more detail.

Facial Recognition with Kairos

For customer identification, we chose to use facial recognition as it would allow for a seamless entry and exit experience without requiring any manual action from shoppers.

We integrated the Kairos facial recognition API, which allowed us to easily enroll and identify faces. New customers would have their face enrolled via a registration process on the store manager app. Subsequently, whenever a customer entered or exited the store, we could match their identity against the database of enrolled faces.

Using Kairos, the facial recognition flow worked as follows:

  1. Capture an image of the customer's face
  2. Send the image to the Kairos API's recognize endpoint
  3. Receive a response with the matched face_id and a confidence score
  4. Use the face_id to look up the corresponding customer in our database
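
As a sketch of steps 2–4, the snippet below posts a captured frame to Kairos and parses out the best match. The endpoint URL, request fields, and response shape are reconstructed from memory of the Kairos docs we used at the time, so treat them as assumptions rather than a current API reference:

```python
import json
import urllib.request

# Assumed endpoint; the actual URL and payload shape may differ from
# whatever Kairos documents today.
KAIROS_RECOGNIZE_URL = "https://api.kairos.com/recognize"

def best_match(payload, min_confidence=0.6):
    """Pick the top enrolled face from a Kairos-style response, or None."""
    for image in payload.get("images", []):
        tx = image.get("transaction", {})
        if tx.get("status") == "success" and tx.get("confidence", 0) >= min_confidence:
            return tx["subject_id"], tx["confidence"]
    return None

def recognize_face(image_b64, gallery, app_id, app_key):
    """POST a base64-encoded frame to Kairos and return (face_id, confidence)."""
    req = urllib.request.Request(
        KAIROS_RECOGNIZE_URL,
        data=json.dumps({"image": image_b64, "gallery_name": gallery}).encode(),
        headers={"app_id": app_id, "app_key": app_key,
                 "Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return best_match(json.load(resp))
```

The confidence threshold matters in practice: below it, we treated the shopper as unidentified rather than risking charging the wrong account.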

One limitation of this approach is that Kairos, like most facial recognition services, requires customers to be enrolled beforehand. This means it wouldn't work for identifying completely new shoppers who are not in the database.

Additionally, relying on a third-party cloud service introduced some challenges. API calls to Kairos were not always instantaneous, which occasionally caused slight delays in the entrance/exit experience, and intermittent API errors would cause customer identification to fail outright.

For a production system, it would likely be better to use an on-device facial recognition model to improve speed and reliability. However, given our time constraints, using the Kairos API was the most practical choice for the hackathon.

Real-Time Inventory and Shopping Carts with Firebase

Another critical component of the system was the real-time database that kept track of product inventory and customer shopping carts. We chose Firebase's Cloud Firestore for this purpose, as it provided an easy way to synchronize data across multiple devices in real time.

The database schema consisted of two main collections – items and users.

The items collection stored the details and current inventory count for each product:

{
  "items": [
    {
      "item_id": "abc123",
      "name": "Raspberry Pi 3",
      "price": 35.00,
      "inventory": 12
    },
    ...
  ]
}

The users collection contained a document for each registered customer, including their virtual shopping cart and current in_store status:

{
  "users": [
    {
      "user_id": "xyz456",
      "name": "John Smith",
      "face_id": "kairos_id_123",
      "in_store": true,
      "cart": [
        {
          "item_id": "abc123",
          "quantity": 1
        },
        ...
      ]
    },
    ...
  ]
}

Whenever a customer entered the store, a background process would continuously monitor the database for changes to that customer's in_store status. As soon as it flipped to true, the facial recognition system would be activated to identify the customer.

Similarly, changes to a product's inventory count were synced in real time to the manager app, letting store employees easily monitor stock levels.

When an item was detected as taken from the shelf, the system would identify the customer, decrement the item's inventory count, and add the item to the customer's virtual cart. All of these database operations happened in real time, with no page refreshes or manual syncing.
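
That "item taken" update can be expressed as a small function over the schema above. This sketch operates on an in-memory dict mirroring the two collections; the hackathon version performed the equivalent reads and writes against Cloud Firestore, and the helper name take_item is ours:

```python
def take_item(db, user_id, item_id):
    """Decrement an item's inventory and add it to a user's virtual cart.

    `db` is an in-memory stand-in for the Firestore data: a dict with
    "items" and "users" lists shaped like the schema shown above.
    """
    item = next(i for i in db["items"] if i["item_id"] == item_id)
    if item["inventory"] <= 0:
        raise ValueError("shelf empty: " + item_id)
    item["inventory"] -= 1

    user = next(u for u in db["users"] if u["user_id"] == user_id)
    for entry in user["cart"]:
        if entry["item_id"] == item_id:
            entry["quantity"] += 1   # already in cart: bump the quantity
            break
    else:
        user["cart"].append({"item_id": item_id, "quantity": 1})
```

In a real Firestore deployment, the inventory decrement and cart append would belong inside a single transaction so two simultaneous grabs can't oversell the shelf.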

Some other notable aspects of the database design:

  • Each user had an associated face_id mapping to their Kairos facial recognition ID, letting us link a recognized face to a database entry
  • The in_store boolean made it simple to filter for the customers currently shopping (e.g. to show on the manager dashboard)
  • Storing a customer's cart as a list of {item_id, quantity} entries made it straightforward to query the cart and to update quantities

One thing to note is that Firebase required all devices to be online in order to function. In a real deployment, maintaining a local cache of the database on each device would allow offline operation and better resilience to network issues.

IoT Smart Shelf with Raspberry Pi

The smart shelf was perhaps the most technically challenging component of the system. In order to automatically detect when items were removed by customers, we needed a way to instrument the physical shelves with sensors.

Due to budget and time constraints, we weren't able to use weight sensors for each individual item cubby as Amazon Go does. Instead, we came up with a simple solution using ultrasonic distance sensors.

We attached an HC-SR04 ultrasonic sensor to each shelf, pointed at the items. The sensor continuously measured the distance to the nearest object and sent this data to a Raspberry Pi via GPIO pins.

On the software side, a Python script on the Pi monitored the distance readings from each sensor. If the distance suddenly increased (i.e. an item was removed), it would trigger the item removal callback.
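
The detection logic amounted to watching for a sudden jump between successive distance readings. Here is a minimal sketch of that loop, with the GPIO sensor reads abstracted behind an iterable; the 5 cm threshold and the function names are illustrative choices, not values from our actual script:

```python
JUMP_THRESHOLD_CM = 5.0  # distance increase that counts as "item removed"

def detect_removals(distance_readings, on_removed):
    """Invoke on_removed(new_distance) whenever the measured distance
    jumps past the threshold between consecutive readings.

    distance_readings: iterable of floats (cm), e.g. polled from an
    HC-SR04 via GPIO on the Pi.
    """
    previous = None
    for reading in distance_readings:
        if previous is not None and reading - previous > JUMP_THRESHOLD_CM:
            on_removed(reading)  # item gone: sensor now sees a farther surface
        previous = reading
```

In practice the raw HC-SR04 readings are noisy, so smoothing a short window of samples before comparing against the threshold cuts down on false triggers.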

At this point, a signal would be sent to an overhead camera to capture an image of the customer. The image was then sent to the facial recognition system to identify who took the item. Finally, the item was added to the customer's virtual cart in the Firebase database and the shelf's inventory count was decremented.

The flow looked something like this:

  1. Ultrasonic sensor detects item removal
  2. Raspberry Pi triggers camera to capture image of customer
  3. Image is sent to facial recognition system to identify customer
  4. Customer's virtual cart is updated in Firebase to include the item
  5. Item inventory is decremented in Firebase

[Insert photo of smart shelf with ultrasonic sensors]

One drawback of the ultrasonic sensor approach is that it can only detect that an item was removed, but not precisely which item. We got around this by only placing one product type on each instrumented shelf, but this wouldn't scale well to a real store with thousands of products.

Additionally, the HC-SR04 sensors had a narrow sensing angle and occasionally missed item removal events if a customer reached in from the side. Using multiple sensors per shelf and fusing the sensor data could help mitigate this.

Despite the limitations, the smart shelf prototype worked well for the small scale demo and provided a good proof-of-concept. In a real world implementation, more sophisticated techniques like weight sensing, LIDAR, or RFID tagging could be used to make the item tracking more precise and robust.

Manager and Customer Mobile Apps

To tie everything together and provide a user-friendly interface for interacting with the system, we built two mobile apps – a store management iPad app for employees and an iOS app for customers.

The manager app served three main functions:

  1. Registering new customers by capturing their photo and creating a facial recognition profile
  2. Monitoring the list of customers currently in the store
  3. Tracking product inventory levels in real-time

[Insert screenshot of manager app interface]

The customer app allowed shoppers to create a profile, log in via facial recognition, view their current cart, and receive an emailed receipt upon exiting.

[Insert screenshot of customer app interface]

On the backend, both apps interacted directly with the Firebase database to read and write customer and product information. Firebase's Cloud Functions platform provided a convenient way to trigger backend processes like sending customer receipts based on database events.
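
The receipt itself is simple enough to sketch: given a user's cart and the items collection, compute the line items and total. In our system the equivalent logic ran in a Cloud Function on exit; the helper names and output format below are hypothetical:

```python
def receipt_total(cart, items):
    """Sum price * quantity across the cart, using the items collection
    (shaped like the schema shown earlier) for price lookups."""
    prices = {i["item_id"]: i["price"] for i in items}
    return round(sum(prices[e["item_id"]] * e["quantity"] for e in cart), 2)

def receipt_lines(cart, items):
    """Render one human-readable line per cart entry for the emailed receipt."""
    by_id = {i["item_id"]: i for i in items}
    return [
        "{} x{} -> ${:.2f}".format(
            by_id[e["item_id"]]["name"],
            e["quantity"],
            by_id[e["item_id"]]["price"] * e["quantity"],
        )
        for e in cart
    ]
```

A production version would also snapshot prices at purchase time, so a later price change can't alter an already-issued receipt.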

System Scalability Analysis

Looking beyond the prototype, it's important to consider how well the system architecture would scale to support a larger deployment across multiple stores and thousands of customers.

On the hardware side, the current prototype relies on a single Raspberry Pi and a small number of ultrasonic sensors, which is not a scalable setup. Instrumenting hundreds of shelves across a full grocery store would require a much more distributed architecture, with multiple edge computing nodes processing sensor data in parallel. The sensing technique would also need to be made more precise to uniquely identify items.

From a software perspective, the Firebase database and facial recognition API held up well under the load of roughly 100 customers during our testing. Firebase's automatic scaling and strong consistency guarantees make it well suited to a high volume of write traffic that needs to be synced across many devices.

However, there could be some concerns around data privacy and security with storing sensitive customer information in a third-party cloud database, especially given the recent data leaks from major tech companies. Depending on the regulatory environment, it may be necessary to use an on-premises or hybrid cloud solution with stricter access controls.

There are also open questions around how the facial recognition system would perform with a large number of enrolled faces across a diverse customer base. False positive identifications could lead to customers getting charged for items they didn't take. Bias and inaccuracies in facial recognition models, particularly for people of color, are well-documented issues that would need to be rigorously tested for.

For these reasons, a real-world deployment of an automated checkout system like this would need significant additional development to make it truly robust and scalable. But the core architecture of connected IoT sensors, real-time data streaming, edge computing, and computer vision provides a solid foundation to build upon.

Future of Automated Retail

The launch of Amazon Go in 2018 was a watershed moment for the retail industry, sparking widespread interest and investment in automated checkout technology. Since then, a number of other companies including Standard Cognition, Zippin, and Trigo have launched similar computer vision-powered checkout systems.

As the technology matures, it's likely that we'll see autonomous checkout move from small convenience stores to larger grocery and retail environments. In addition to eliminating friction at checkout, it will enable retailers to collect fine-grained data about in-store shopping behavior to inform everything from inventory planning to store layout optimization.

At the same time, widespread adoption of this technology raises some potential societal implications that warrant further discussion, such as:

  • Job displacement for retail workers: Although Amazon has claimed its Go stores have not led to cashier job losses, it's not hard to imagine a future where most retail transactions are automated, reducing the need for human workers. Policies and programs to retrain displaced workers may become increasingly important.
  • Privacy concerns with ubiquitous surveillance: Relying on cameras and facial recognition to track shoppers' every move raises valid privacy questions. Clear communication and consent around data practices will be essential for maintaining customer trust. Retailers may also need to implement measures to anonymize and restrict access to collected data.
  • Potential for technical failures: If this technology becomes widely deployed, we may become overly reliant on it functioning properly for stores to operate smoothly. Safeguards would need to be put in place to handle situations gracefully if the system crashes or edge devices go offline.

Despite these challenges, the potential benefits of automated retail in terms of convenience and efficiency are hard to ignore. As a technologist, I believe we have a responsibility to continue pushing the boundaries of what's possible while thoughtfully considering the ramifications.

Building our makeshift Amazon Go clone in 36 hours was an exhilarating experience that opened my eyes to how quickly breakthrough innovations can be prototyped by small, scrappy teams with today's tools. I'm excited to see how this technology evolves in the coming years and look forward to further experimentation in the retail automation space.

If you have any questions or want to chat more about the future of autonomous checkout, feel free to get in touch!
