A Contrarian View of Deep Learning

I like to end my work weeks by reading an academic paper on AI, scintillating I know. 🙂 However, I find that when a subject is changing quickly or there is a lot of hype, it is often best to go to the source material to know exactly what is going on. With computer science we have the added advantage of being able to get the corresponding code from GitHub to try it out. In business you have to spend extra time scrutinizing the SEC filings to normalize the financials for comparison. I have been impressed with the results from deep learning and excited about what should be possible in the future. Because I feel this way, it was time for me to seek out an alternative perspective. So I recently read “Deep Learning: A Critical Appraisal” by Gary Marcus, who is a professor in the Department of Psychology and Neural Science at NYU. It is an interesting paper and was controversial when it was published. I will give a brief review of the paper and a few of my thoughts.


Image source: Wikimedia

To get everyone on the same page, deep learning is a statistical technique for classifying data and is at the core of many of the AI advances we enjoy today. Labeled data is used to train a model, and when a new piece of information is presented the model classifies it based on the data it was previously exposed to.
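To make the train-then-classify idea concrete, here is a minimal sketch. It is not a neural network: it swaps in a much simpler nearest-centroid classifier, and the animal measurements are made up purely for illustration. The shape of the workflow is the same though: learn from labeled examples, then label something new.

```python
from collections import defaultdict
from math import dist


def train(examples):
    """Average the feature vectors of each label into one centroid per label."""
    grouped = defaultdict(list)
    for features, label in examples:
        grouped[label].append(features)
    return {
        label: tuple(sum(column) / len(rows) for column in zip(*rows))
        for label, rows in grouped.items()
    }


def classify(centroids, features):
    """Return the label whose centroid is closest to the new example."""
    return min(centroids, key=lambda label: dist(centroids[label], features))


# Made-up labeled data: (height_cm, weight_kg) -> label.
labeled = [
    ((30.0, 4.0), "cat"),
    ((25.0, 3.5), "cat"),
    ((60.0, 25.0), "dog"),
    ((70.0, 30.0), "dog"),
]
centroids = train(labeled)
prediction = classify(centroids, (28.0, 5.0))
print(prediction)
```

A deep learning model replaces the hand-picked features and the centroid averaging with many learned layers, which is where the appetite for data comes from.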

AI Challenges

Even with the many applications this technique has, Gary Marcus calls out 10 challenges:

  • Deep learning is data hungry and requires too much data to make basic classifications that humans can often do with a single example.
  • Despite its name, deep learning is only deep architecturally and not from a knowledge perspective with limited transferring of insights.
  • There is no natural way to deal with hierarchical structure. Changing the order of words, or leaving a sentence open to interpretation, can be difficult to handle when the data the model must generalize to differs widely from the training set.
  • Deep learning struggles with open-ended inference. It should be noted that this is a problem that the industry is trying to address through new AI contests like SQuAD v2.
  • People are concerned with the ethics of AI and the lack of transparency does not help. As AI becomes more central to a lot of decision making it will be critical that the models can be audited for the right level of accountability.
  • Prior knowledge has not been well integrated into deep learning models. Of the objections this is one that I personally enjoyed. Models are interpretations based on previously seen data but there are some bodies of knowledge like physics that have formulas that can be used explicitly on new examples without requiring interpretation.
  • Causation and correlation are not inherently differentiated with deep learning.
  • The world is largely assumed to be stable and it is not. That is why today AI is great at playing games but is still difficult to use in open-ended problem spaces.
  • Deep learning is great at approximation but cannot be fully trusted. This reminds me of using language translation where the results are really good for establishing understanding but at this time I would still want a human if I was dealing with a contract.
  • It is difficult to engineer deep learning. Although true, there is a lot to disagree with. The frameworks and tools (e.g. PyTorch, TensorFlow, Keras, ONNX) are making it easier to work with common building blocks, which takes some of the difficulty out of development, and they support popular neural network structures, including newer ones like generative adversarial networks (GANs).

The paper highlights 10 challenges but also includes 4 ways to make deep learning better, which I will expand upon here.

Paper’s Recommendations for AI

Unsupervised Learning

Unsupervised learning is a theoretically better approach to learning, which is something that I agree with. I wrote a news reader that I use to provide conservative and liberal perspectives on news stories. Labeling data in sufficient quantity to make this work is tough. Instead I used document clustering so that I would not have to rely on labeled data. This strategy worked really well in providing me a more balanced news reader. With that said, unsupervised learning still has further to go if it is going to surpass supervised learning.
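For anyone curious what document clustering without labels can look like, here is a rough sketch. It is not the code from my news reader: the bag-of-words vectors, the greedy single-pass grouping, the 0.3 threshold and the sample headlines are all assumptions chosen to keep the example short.

```python
from collections import Counter
from math import sqrt


def vectorize(text):
    """Turn a headline into a bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[word] * b[word] for word in a if word in b)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def cluster(texts, threshold=0.3):
    """Greedy single pass: join the most similar cluster or start a new one."""
    vectors = [vectorize(t) for t in texts]
    clusters = []  # each cluster is a list of indices into texts
    for i, vec in enumerate(vectors):
        best, best_score = None, threshold
        for members in clusters:
            score = max(cosine(vec, vectors[j]) for j in members)
            if score >= best_score:
                best, best_score = members, score
        if best is None:
            clusters.append([i])
        else:
            best.append(i)
    return clusters


# Made-up headlines; no labels are supplied anywhere.
headlines = [
    "senate passes new tax bill",
    "tax bill clears senate vote",
    "storm floods coastal towns",
    "coastal towns hit by storm floods",
]
groups = cluster(headlines)
print(groups)
```

Stories about the same event land in the same group because they share vocabulary, so coverage from different outlets can be shown side by side without anyone labeling the articles first.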

Symbol-manipulation, and hybrid models

Hybrid models can be used to address the prior knowledge problem that was raised as part of the challenges. By encoding math, physics and other codified knowledge into a hybrid model, definitive knowledge could be applied directly to get answers, with the deep learning process used to classify information when there is no symbol-manipulation method to do it. Note that in the critiques of this paper this was an area that was aggressively pushed back on, since the results historically haven’t been as good as deep learning. Although true, there was a time when it was said that deep learning was a failure and now it is the dominant AI approach, so it might not be right to completely dismiss symbol-manipulation.
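One way to picture a hybrid is a dispatcher that prefers an explicit formula and only falls back to a learned model when no formula applies. The sketch below is my own illustration, not something proposed in the paper: the `RULES` registry, the `answer` function and the stubbed `learned_guess` model are all hypothetical names.

```python
def kinetic_energy(mass_kg, speed_m_s):
    """Codified physics: E = 1/2 * m * v^2, applied exactly with no training data."""
    return 0.5 * mass_kg * speed_m_s ** 2


# Symbolic rules keyed by question type (hypothetical registry).
RULES = {"kinetic_energy": kinetic_energy}


def learned_guess(question, inputs):
    """Stand-in for a trained model; a real one would return a prediction."""
    return "unknown"


def answer(question, **inputs):
    """Use an explicit formula when one exists, otherwise defer to the model."""
    rule = RULES.get(question)
    if rule is not None:
        return ("rule", rule(**inputs))
    return ("model", learned_guess(question, inputs))


print(answer("kinetic_energy", mass_kg=2.0, speed_m_s=3.0))
print(answer("is_this_spam", text="hi"))
```

The returned tag records which path produced the answer, which also speaks to the transparency concern raised in the challenges: a rule-based answer can be audited directly.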

More insights from cognitive development

Ethics is something that is top of mind for people dealing with AI and trying to make sure that it has a positive impact on society. Part of the way to deal with these issues is to not just rely on mathematics but to also include insights from human psychology. In principle I really like this concept but I don’t know enough about psychology to determine how much of it can be integrated into models. There is also the notion of common sense being integrated into the models. However, looking at the many issues being publicly discussed it is clear that there would be a lot of problems in determining what exactly is “common” in our collective sense. Math and science would be a lot easier to research and codify than human psychology, but adding it to the mix would be an important development.

Bolder challenges

AI contests have done a great job of rallying industry and academia to take on new challenges. However, if the goal is arguably to reach artificial general intelligence, then there need to be bolder challenges that expand beyond confined closed-end problems. There are some signs that this is becoming accepted in the AI community. For example, the Stanford Question Answering Dataset (SQuAD) originally focused on finding answers in a passage, but the second version also requires determining when the answer is not in the passage. This is a model that should be replicated in a lot of the tests that are out there.

Conclusion

Overall this was a good read with a lot of thought provoking questions. There are a lot of great AI advances with more progress to come. Advances in processors (CPUs to GPUs to FPGAs) will only do so much, but advances in quantum computing should provide a fundamental step function in making the existing deep learning process more effective. However, even with that there will most likely need to be a new approach to get to the next level, like the past transition from expert systems to neural networks. Geoff Hinton mentioned this as well with “science progresses one funeral at a time. The future depends on some graduate student who is deeply suspicious of everything I have said.”

Talk to you later,

Orville (@orville_m)

A History of Amazon’s Shareholder Letters

This weekend there were a lot of great tweets and articles about Jeff Bezos’ annual shareholder letter about high standards and the value of writing. Although there were a lot of great conversations about it, there was an ironic miss in that Jeff talks about deep thoughts translated into the written word, yet most analyses came out within a day and lacked deeper perspective. That got me thinking about binge reading all of the shareholder letters starting from the beginning to see how the Amazon story has evolved from Jeff’s perspective. My notes here are things that jumped out at me with the benefit of hindsight and I will not pretend that this is going to be a great piece of writing, although I still welcome your feedback. Before we get started I should call out that I don’t work at Amazon but I do own a few shares and too many Prime boxes show up at my house regularly.

Brief summaries of all of the letters are still a lot of reading, so here is a TLDR version with the full summaries afterwards. The letters follow multiyear themes that I categorized as:

  • Early growth years (1997 – 2000)
  • Operational years (2001 – 2002)
  • Being a shareholder (2003 – 2005)
  • New seeds of growth (2006 – 2007)
  • Recession part 2 (2008 – 2009)
  • Amazon as a Service (2010 – 2011)
  • Customer obsession (2012 – 2013)
  • Amazon 2.0 (2014 – 2015)
  • Culture of success (2016 – 2017)

Regardless of the business environment he stayed focused on essentially the same prioritized themes for two decades but made refinements as the technologies and business environment changed.

So if you are interested in longer form content, here is a brief tour of Amazon through the shareholder letter.

Early Growth Years

1997 Letter to Shareholders (Link)

This is a popular letter so I won’t spend a lot of time summarizing it. The key thing to note is that he lays out the core principles around customer obsession, Day 1, prioritizing cash flows, big bold bets, transparency in decision making and long-term thinking.

1998 Letter to Shareholders (Link)

Amazon is still a small company that is growing rapidly and is learning how to seize the internet opportunity. With a $1 billion run rate, 6.2 million customers, 200,000+ shareholders and an almost 4 fold increase in employees to 2,100 it is a delicate time for a startup. Here Jeff explains how to hire the right kind of people for long-term growth.

1999 Letter to Shareholders (Link)

Jeff explains what a shareholder gets by owning Amazon shares and the 6 goals for the future. What I like about this letter is that after 4.5 years Amazon has built the underlying infrastructure that will be responsible for future growth. In ending he touches on a future where increased bandwidth will make it possible to improve the shopping experience for people at home and an increase in non-PC devices accessing the internet wirelessly.

2000 Letter to Shareholders (Link)

Like all tech companies at the time, Amazon’s stock is getting rocked in the capital markets and is down 80% since the previous year. What is most telling about this letter is the distinct difference between the performance of the company and its publicly traded stock. He makes it clear that there won’t be any medium sized internet companies because of the benefits of scale. In his justification on why e-commerce would survive he highlights how Moore’s Law will make it easier for Amazon to serve more customers while keeping costs fixed.

Some interesting points:

  • Amazon had 20 million customers up from 14 million the previous year.
  • They helped Toys R Us sell $125 million of toys in Q4 of 2000. As I purchased some stuff from the liquidation sale I wondered why Toys R Us didn’t use the early learning of this time to become the dominant toy seller online instead of giving the investors dividends.
  • Amazon got an 84 rating on the American Customer Satisfaction Index, the highest of any service company ever to this point.

Operational Years

2001 Letter to Shareholders (Link)

Customer obsession and long-term shareholder value are the focus of this letter. What I like most about this letter is how he breaks down the economics of the business: 4 years of investing for growth, followed by 2 years of cost reduction, and now the point of reaccelerating growth while controlling costs. This is not an easy thing to do and most companies were just thinking about how to survive. That is when Jeff talks about the customer focused investments: launching “Look Inside the Book”, building tools to prevent people from accidentally buying the same thing twice, and adding self-serve capabilities for customers. Shareholders are told that by keeping costs fixed while serving more customers they can expect more cash per share into the future. Overall this is a concise letter considering how much ground it covers.

2002 Letter to Shareholders (Link)

Trading real estate for technology was the great insight. Since Amazon is a virtual presence rather than a physical store, they are able to do something that is a paradox for a physical store: offer a great customer experience and lower prices at the same time. Amazon was able to do this because they rethought what the customer experience should be like instead of just applying the physical retail experience to online. My favorite nugget is the following line, which shows that the dream of immediate delivery with low prices was being thought of in 2002. “[Y]ou may find reasons to shop in the physical world—for instance, if you need something immediately—but, if you do so, you’ll be paying a premium. If you want to save money and time, you’ll do better by shopping at amazon.com.”

Being a Shareholder

2003 Letter to Shareholders (Link)

Running a business as a long-term owner requires enduring short-term issues for a longer-term benefit to the business. The example that Jeff Bezos provides that I like is around the Instant Order Update feature that flags if you purchased something already. It reduced sales by a statistically significant amount, yet the positive impact for customers works out over the long run. Again in this letter he calls out that the ability to spread investment in features over increasingly larger groups of people provides a cost advantage. My feeling is that Jeff is educating shareholders that being a tech company has advantages. It is easy to believe that in 2018 but I am assuming it was an important message in 2003.

2004 Letter to Shareholders (Link)

2004’s letter is all about free cash flow (FCF). If you are not familiar with this term, the cash generated by running the business (cash flow from operations) minus the cash invested back into it (capital expenditures) equals FCF. Accounting earnings can sometimes be manipulated, but cash tends to be pretty clear.
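As a quick worked example of that arithmetic, here is a small sketch. The figures are invented round numbers, not Amazon’s reported results, and the function name is just for illustration.

```python
def free_cash_flow(operating_cash_flow, capital_expenditures):
    """FCF = cash flow from operations minus cash spent on capital expenditures."""
    return operating_cash_flow - capital_expenditures


# Hypothetical company, figures in millions of dollars.
operating_cash_flow = 1_000
capital_expenditures = 250
fcf = free_cash_flow(operating_cash_flow, capital_expenditures)
print(fcf)

# Per-share view, since the letters talk about cash per share (made-up share count).
shares_outstanding = 500
fcf_per_share = fcf / shares_outstanding
print(fcf_per_share)
```

Dividing by the share count gives the per-share figure that long-term owners care about, which is why share dilution matters as much as the cash itself.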

2005 Letter to Shareholders (Link)

Similar to 2004, this letter aims to explain how Amazon makes decisions. As is almost cliché in 2018, Amazon is a data-driven organization, but there is also humility in acknowledging that some decisions cannot be made with existing data and in those cases they will opt for what is best for the customer. A tidbit that I noticed is that their inventory turnover (14) was significantly lower than in a previous annual report, where it was 19. I am not sure if this is due to holding more inventory or other factors but it caught my attention.
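For readers who don’t follow retail metrics, inventory turnover is commonly computed as cost of goods sold divided by average inventory. The sketch below uses that common definition, which may differ in detail from how Amazon calculates it, and the inputs are invented so that the result lands on 14.

```python
def inventory_turnover(cost_of_goods_sold, average_inventory):
    """How many times inventory is sold and replaced over the period."""
    return cost_of_goods_sold / average_inventory


# Invented figures in millions of dollars, chosen to give 14 turns.
turns = inventory_turnover(cost_of_goods_sold=1_400, average_inventory=100)
print(turns)

# Roughly how many days an item sits on the shelf at that pace.
days_on_hand = 365 / turns
print(round(days_on_hand, 1))
```

Fewer turns means goods sit longer before they are sold, which ties up more cash in inventory.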

New Seeds of Growth

2006 Letter to Shareholders (Link)

In 2006 it is clear that Amazon is dominant when it comes to e-commerce, so where will the future growth come from? Jeff discusses planting seeds for large differentiated ideas that will grow into the future billion-dollar businesses, and he mentions two. The first is Fulfillment by Amazon, where retailers can use web service APIs to manage inventory in Amazon fulfillment centers. The second is Amazon Web Services (AWS). Although in hindsight we know AWS will have a huge impact on how software is delivered, what I find most interesting is that this is the first time that Amazon the company is framed as a true platform for others.

2007 Letter to Shareholders (Link)

Continuing with the innovation theme, Kindle is introduced. What I like about this letter is the first principles approach to creating the Kindle. In recognizing that what people enjoy most about books is getting engrossed in the author’s writing and not the physical book, Amazon kept that part while adding improvements that are possible with a digital device. This letter is also the first time the term “cloud” is used, quotation marks included.

Recession Part 2

2008 Letter to Shareholders (Link)

Jeff Bezos has clearly been battle tested from the dot-com fallout and seems bolder going into the great recession of 2008. Besides his themes of investing for the long-term, Kindle, AWS, cash flows and customer obsession, there are two things that are appropriate for the time. One is working backwards from the customer, which leads to creating the right product/service regardless of the current capabilities of the organization. Creating the Kindle hardware is an example of this. The other is “muda”, or corporate waste, and how to keep costs under control, which will be important in the coming years.

2009 Letter to Shareholders (Link)

During the first recession the shareholder letters focused on educating the reader on how Amazon operated. This letter returns to this approach. After discussing the great things Amazon has accomplished it goes into goal setting. What is interesting about the goal setting process is that it is very close to their values with 360 of 452 goals directly impacting customer experience with the words revenue and free cash flow only showing up 8 and 4 times respectively.

Amazon as a Service

2010 Letter to Shareholders (Link)

To date this letter has to be the nerdiest SEC filing I have read in my life and I have read filings from all of the major technology companies. It opens by stating “[r]andom forests, naïve Bayesian estimators, RESTful services, gossip protocols, eventual consistency, data sharding, anti-entropy, Byzantine quorum, erasure coding, vector clocks … walk into certain Amazon meetings, and you may momentarily think you’ve stumbled into a computer science lecture.” Machine learning and neural networks are also mentioned later on. Making this even odder is that there is no focus on customer obsession. My guess is that this letter is directly targeted at Wall Street letting them know that even with the recession Amazon will be investing heavily in technology bets that will directly impact free cash flows in the future.

2011 Letter to Shareholders (Link)

2011’s letter feels like the perfect sequel to tie up the loose ends of the 2010 letter. Amazon has transitioned from a platform to help customers to a platform that supports people to build businesses and express themselves. While describing AWS, Fulfillment by Amazon and the Kindle Direct Publishing platform, the emphasis is on being self-service because it allows innovation to happen resulting in more diversity of successful ideas. Also as a sign of the times there is a strong emphasis on how people are able to make a living using these platforms even as other job prospects were disappearing.

Customer Obsession

2012 Letter to Shareholders (Link)

Customer obsession returns, with Amazon focusing on doing things to help customers proactively. For example, Prime keeps adding services even though there is no competitor pressure to do so. The key insight is that it is better to continuously increase benefits to customers and earn their trust than to wait for competitor pressure to do so.

2013 Letter to Shareholders (Link)

Employees become a larger part of the narrative including breaking ground for new buildings in Seattle for their headquarters which should help with attracting and retaining employees. Amazon is really starting to innovate at this stage and in this letter Jeff mentions that failing fast and iterating is the model they are taking.

Amazon 2.0

2014 Letter to Shareholders (Link)

Dreamy businesses have 4 characteristics: customers love them, they can grow to be large, they have strong returns on capital and they are durable. The businesses identified with these characteristics are Marketplace, Prime, and AWS.

2015 Letter to Shareholders (Link)

Culture is the core theme of this letter. AWS reached $10 billion in revenue faster than amazon.com and it might appear like they got there in different ways but it is actually similar. Amazon aims to have an inventive culture and that includes a tolerance for risk. Also, with the possibility of outsized returns, taking as many chances as possible increases the odds of success. Inventiveness is not just limited to products and services but also extends to how they approach benefits for their warehouse employees. One sign that a company is operating at a significantly larger scale is when culture becomes a larger part of the narrative.

Culture of Success

2016 Letter to Shareholders (Link)

This letter shows Bezos’ realization that as a large successful company it is important to keep everyone hungry and fighting like it is day 1. Many of the themes that were raised in the 2017 letter to shareholders were also present in this one. Some key points:

  • True Customer Obsession – customers are always subconsciously dissatisfied and that presents an opportunity to invent on their behalf.
  • Resist proxies – process, surveys, research and other stand-ins can prevent you from truly understanding the customers of a product. It is important to have a vision for helping customers.
  • Embrace external trends – which are normally obvious if you are keeping your eyes open and fighting them can often lead to the death of your company. In this letter he talks extensively about machine learning and artificial intelligence.
  • High velocity decision making – big companies make high quality decisions, but the problem is that they make them slowly. Important ways around this are: don’t use a one-size-fits-all decision making process, decide when you have roughly 70% of the information, disagree & commit, and escalate and address true misalignment immediately.

2017 Letter to Shareholders (Link)

How to achieve high standards is the theme of this letter. Although it is day 1 for Amazon, when the focus is so heavily about culture it is a recognition that the company is at the top so it is important that it makes the right moves to stay there. High standards are teachable and are domain specific. Recognizing the scope for achieving high standards is important so that people are realistic about getting there. I do agree with Jeff that high standards are fun and once you are in a team of all-stars it is hard to work in another type of organization. Wrapping up the letter is a list of the notable achievements that shows the various businesses that Amazon is in.

Conclusion

There it is, a summary of 2 decades of letters to shareholders starting from the beginning to the end. Similar to reading the Berkshire Hathaway letters many insights become apparent. The biggest takeaway is maintaining the consistency of the goals while adapting to the environment and life stage of the company.

Talk to you soon,

Orville | Twitter: @orville_m

Restoring TensorFlow Models

I was restoring a saved model in TensorFlow 1.5 when I got the error below.

DataLossError (see above for traceback): Unable to open table file …: Data loss: not an sstable (bad magic number):

Checkpoint v2 saves the checkpoint as 3 files (*.data-00000-of-00001, *.index, *.meta). So when restoring the checkpoint, remove those extensions and pass just the shared filename prefix, *.ckpt
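A small helper can do the trimming so the mistake is harder to repeat. This is a sketch under assumptions: the `checkpoint_prefix` function and the `model.ckpt` filename are made up for illustration, and the TensorFlow 1.x restore call is left in comments so the snippet runs without TensorFlow installed.

```python
import re

# Suffixes of the three files listed above.
_SUFFIX = re.compile(r"\.(index|meta|data-\d{5}-of-\d{5})$")


def checkpoint_prefix(path):
    """Strip a checkpoint file suffix so only the shared prefix remains."""
    return _SUFFIX.sub("", path)


# Hypothetical filename picked from a directory listing.
prefix = checkpoint_prefix("model.ckpt.data-00000-of-00001")
print(prefix)

# With TensorFlow 1.x the prefix is what gets handed to the saver, e.g.:
#   saver = tf.train.Saver()
#   saver.restore(sess, prefix)
```

Paths that are already a bare prefix pass through unchanged, so the helper is safe to call on whatever string you have.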

Orville

Numbers & Narratives: Amazon buys Whole Foods and gets options

It often seems like technology company deals are so ugly that only the bankers could love them. That is why it was refreshing to see a deal that makes sense on a lot of levels. Reading a lot of the news reports I feel that there is a deeper narrative at play that has largely been glossed over. So here is a quick summary of the great options that this deal provides Amazon.

In the short-term this deal gives Amazon a power position over their competitors in food delivery. Instacart for example uses Whole Foods stores to get groceries and deliver them to customers. Assuming this arrangement continues, Amazon will benefit by getting a cut of the Instacart transactions. You can think of this as the Amazon tax. If Instacart starts making more money, Amazon can just raise prices via Whole Foods to get a larger cut of Instacart’s business. Instacart could also decide to move on to another grocer but Whole Foods has a unique food selection. Plus, Amazon has a history of letting others use their infrastructure (e.g. AWS, fulfillment centers) and might prefer continually getting money from Instacart transactions rather than kicking them out. This deal also has short-term implications for Amazon’s current partnerships. Here in the greater Seattle area Amazon works with PCC, which is another grocery store. Amazon can continue to work with PCC and use their ownership of Whole Foods as leverage in future negotiations. However, PCC will most likely be relegated to supporting burstable demand when the Whole Foods stores don’t have enough supply or are too busy. From a retail perspective, I wonder if Prime will enable Whole Foods to lose their “Whole Paycheck” reputation by using Prime as a driver of revenue in physical retail while lowering prices in the actual store. This is similar to Costco’s model.

Another interesting option is the stores being the fulfillment centers for groceries and perishable goods. Amazon has 99.4 million square feet of fulfillment centers (this also includes data centers, which Amazon bundles in this number) in North America (source) and with Whole Foods they will now have an additional 17.8 million square feet spread across 456 stores in the USA, Canada and Europe (source). Whole Foods stores are known for being in prime locations in urban areas, providing ideal locations for Amazon to reach customers in densely populated areas. These areas have higher costs and are not the type of places where a frugal retailer would open a fulfillment center. Closer locations to densely populated areas give Amazon the opportunity to provide other goods more easily to customers in 2 hours or less. Proximity is critical and the grocery store makes having a mini urban fulfillment center a profitable endeavor and not just a cost center. The more stuff you can get in this timeframe, the less likely you will be to go to the store, which leads to buying more stuff from Amazon and the flywheel will pick up speed. Amazon arrives at my door so often now that I am starting to recognize some of the drivers.

In the long run this is where we can see some of the more futuristic scenarios like Amazon Go. Amazon Go uses artificial intelligence so customers can grab what they need and go. As sexy as these scenarios are, unless the technology is ready for prime time it is unlikely this was a strong driver for the deal. According to Amazon’s 8-K, they will be paying $13.7 billion, financed with debt, so any efficiency from AI won’t make this deal pay off right away.

Speaking of money, when you look at Amazon’s financials it makes sense that they would borrow the money. Having $15.4 billion in cash at the end of the last quarter highlights that Amazon is a technology company that has the financials of a retailer. Whole Foods’ free cash flow of $400 million last year is also not a spigot of cash that will make this deal a quick payoff. Some argue that Amazon got Whole Foods for free because Amazon’s market capitalization increased by $11.2 billion today. This is a cash deal so the currency comparison in this case is apples and oranges unless they are going to have some secondary offerings. More importantly, Amazon has never been overly concerned about profits and this time should not be any different.

In my opinion this deal makes sense because of the short, medium and long term optionality that sets Amazon up for success regardless of how the food delivery industry turns out. It is for those reasons that I feel that Amazon truly made a strategic purchase in acquiring Whole Foods (pending regulatory approval of course).

Talk to you soon,

Orville | Twitter: @orville_m

Lessons learned from my Holiday Coding Project

One of my favorite things to do during the holidays is a coding project. Between work and family commitments it is possible to do side projects during the year but harder to just immerse yourself in a problem. My only criterion for the projects is that the technologies that I use have to be different from what I use at work. Normally I pick a technology that I am interested in and then find a project for it. Although I have learned a lot that way, the approach I took this year was a lot better.

So here is a quick list of what I learned about how to do my annual holiday project:

  • Be mission driven instead of technology focused. Normally I would think of a technology that I want to learn about and then find a problem to apply it to. This year there was a problem that had been on my mind that I wanted to address. By looking at the problem first, you end up finding the best technology to address it, which is often different from the one you originally had in mind.
  • Measure twice, cut once. If I was working on a project during crunch time I would use the best solution that I could think of and code away. However, with more time I could look at different alternatives before deciding. For example, I could look at different NLP papers for different approaches. Doing this seems like a waste of time when you don’t have a lot of it. Yet it resulted in me getting more done in less time because I could think about the trade-offs before I wrote the first line of code. That way I avoided losing time to short-sighted algorithms or rewriting buggy code.
  • Go back to school. Academic papers contain so many great insights that I will probably continue to read these to learn about new topics instead of just blogs. For this project, I spent a lot of time reading about word sense disambiguation, sentiment analysis and opinion mining. Funny how a simple question could lead to reading about so many different things. I have also been a long-time fan of MOOCs. Often I will listen to Udacity lectures to come up to speed on a topic and at times even do some of the assignments. That is how I became familiar with TensorFlow.
  • Manage the clock. Maybe it is because I have been watching so much football (good luck next year UW) but I was always careful to manage my time throughout the project. I was firm that I needed to be code complete a few days before my vacation was over. My original thinking was that the project would go from something fun to a chore if I was trying to complete it while I was busy with work and other life commitments. The added benefit is that I became ruthless in prioritizing my time and even cut “cute” features to focus on what was most important.
  • Focus on Minimal and Viable. When I talk to people building new products they always talk about the MVP. Once you start digging into details it is clear that they are preparing for a moon landing. Focus on the shortest path to viable and feel free to use lots of open source code.
  • Not all open source projects are equal. Open source software has made it possible to create something amazing quickly. Before standing on the shoulders of giants you should make sure that the giant is in shape. For me that involved a few checks: when was the last time someone worked on the project, why did the person create the project, has anyone contributed to it, etc. However, I found the most important check was to read the code itself. Is it readable, does it have error checking and does it do what it claims to? In a few instances, I leveraged repos that hadn’t been touched in years but were of good quality.

This post could easily have been longer but these were a few things that popped out at me. At this point you might be wondering what did I create? Well I had so much fun with this one I will probably release it in the coming weeks and will create another blog post discussing the problem being solved and the solution I proposed. Now I need to finish getting ready for CES… everyday life continues. 🙂

Talk to you soon,

Orville | Twitter: @orville_m

Fixing Python Errors When Installing TensorFlow on El Capitan

While installing TensorFlow on a Mac running El Capitan I kept getting one of those crazy Python errors that are impossible to decipher. Thanks to Stack Overflow, which always saves me while coding, I found out the problem. It is apparently “due to the System Integrity Protection introduced in OS X El Capitan.” The fix is copied below for the next time I run into this error. Thank you for the tip Kof.

sudo pip install --upgrade $TF_BINARY_URL --user python
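As background that is not part of the Stack Overflow answer: pip’s --user flag installs packages into a per-user directory instead of the system-wide one, which is presumably why it sidesteps the protected locations. A short sketch for checking where that directory is for the interpreter you are running:

```python
import site
import sys

# Directory that `pip install --user` targets for the running interpreter.
user_site = site.getusersitepackages()
print("user site-packages:", user_site)

# Install prefix of the interpreter itself, for comparison.
print("interpreter prefix:", sys.prefix)
```

If the user directory sits under your home folder, installs there should not need to write into system paths.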