Creepy or cool? Some AI breakthroughs and a formula of life

You keep hearing that AI is bad and that once AGI arrives, it will kill us off – the paperclip maximizer is a telling example. Check out the tidbits below pushing the envelope.

Music

Historically, people line up to attend concerts of famous artists. Now there is AI that generates pure gold jazz or what sounds like a mix of jazz and classical. Would you line up to hear these pieces? Would you still line up if you didn’t know whether it was an algorithm or a human?

Fiction

Do you like Harry Potter? What about this Harry Potter? This algorithm learnt from the first few chapters of J.K. Rowling’s Harry Potter and created a novel of its own. Forget about J.K. Rowling, move on.

Film

TV series are great. Here is a script for Silicon Valley, generated by AI. Or a credible-looking video generated from a few dozen words (and some prior video training). Hollywood has taken heed.

Human behaviour

MIT researchers created an AI system that predicts human behaviour by approximating human “intuition” from vast amounts of data, and pitted it against human teams in data science competitions. The algorithm didn’t get the top score, but it beat 615 of the 906 human teams competing. In two of the competitions, it created models that were 94% and 96% as accurate as those of the winning teams. Whereas the human teams required months to build their prediction algorithms, this algorithm trained in 2–12 hours.

Cannibalism

Once virtual Adam and Eve (AI bots) were done with apples, they ate Stan, an innocent bystander (another AI bot) that happened to look like an apple.

Formula of life

OK, all of the above is creepy, cool or scary, depending on your knowledge, interests and approach to life. But could these AI concepts eventually yield actual, even natural, life forms?

Even the Artificial Life community acknowledges that the definition of “life” is contentious.

Darwin’s theory, and common belief, holds that there is a clear difference between living organisms (in how they come to be and evolve) and everything else (from water vortexes to AI systems to the coastline of England). Popular hypotheses credit a primordial soup, the big bang and a colossal stroke of luck for the creation of life. Erwin Schrödinger framed life merely as physical processes in his treatise “What is Life?”.

But until now we have had a hard time explaining how open thermodynamic systems like our universe, and even Earth, evolved and how lifeforms evolved within them. We have answers for closed and weakly open systems. Until now.

Jeremy England from MIT has given it a thermodynamic framing: it’s all about entropy (to create life, one has to locally decrease entropy). Carbon is not God. In his view, there is one essential difference between living things and inanimate chunks of carbon atoms: the former tend to be much better at capturing energy from their environment and dissipating that energy as heat. He has a mathematical result indicating that when a group of atoms is driven by an external source of energy (like the sun) and surrounded by a heat bath (like the ocean or atmosphere), it will often gradually restructure itself so as to dissipate increasingly more energy. This implies that under certain conditions, matter may acquire the key physical attribute associated with life.
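For the curious, England’s derivation is usually presented as building on fluctuation theorems from non-equilibrium statistical mechanics, such as the Crooks relation (a sketch of the starting point, not his full argument):

$$\frac{P_F(+W)}{P_R(-W)} = e^{\beta (W - \Delta F)}$$

Here β is the inverse temperature, W the work done on the system and ΔF the free-energy change: a driven forward trajectory is exponentially more likely than its time-reverse the more energy it dissipates. England’s coarse-grained version of this, roughly, is that the more irreversible a transition between arrangements of matter is, the more heat it must dissipate on average – which is exactly what driven, self-replicating structures turn out to be good at.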

Now back to the AI craze above. If we could build mechanisms that locally decrease entropy in AI systems along the lines of Jeremy England’s framing, the near future could see a new Cambrian explosion of artificially constructed forms of life: songs, movies, fiction… and perhaps new and better beings!

Here are more creepy/cool AI applications or here. Enjoy!

P.S. Ralph Merkle thinks of Bitcoin as a form of life:

Bitcoin is the first example of a new form of life. It lives and breathes on the internet. It lives because it can pay people to keep it alive. It lives because it performs a useful service that people will pay it to perform. … It can’t be stopped. It can’t even be interrupted. If nuclear war destroyed half of our planet, it would continue to live, uncorrupted.

Two extremes in the blockchain sphere illustrate its current state

How do you know how well an industry/technology/product/… is doing?

One way is to find the two contextual extremes (from economic, technological, social and other perspectives) of that industry/technology/product/…

Let’s have a go at Bitcoin/blockchain.

If you go to Etherscan, click on the top menu item Tokens and then View Tokens, and search for “fuck,” you get the results below (a screenshot from a few days ago).

[Screenshot: Etherscan token search results for “fuck”]

This of course is just one extreme, the negative one, illustrating at once the absurdity, creativity and ambition one can find in the crypto space. While some of those tokens are placeholders, some, like FUCK Token, have a website and give the impression of an upcoming product. The F-word-derived tokens allude to Facebook and Ethereum. Unsurprising.

For a positive extreme, check out DeepRadiology, which employs both deep learning (Yann LeCun, a pioneer of deep learning, is their advisor) and blockchain for “applying the latest imaging analytic deep learning algorithm capability for all imaging modalities to optimize your facility service needs.”

[Screenshot: DeepRadiology website]

DeepRadiology uses AI to process vast amounts of data and blockchain to store and distribute it efficiently and effectively. Both the time to process, for example, a CT scan and the associated costs are an order of magnitude lower than the incumbents’. Everyone wins.

And while Q1 2018 was bearish for all cryptocurrencies, whether the rest of 2018 will be bearish or bullish is still an open question. Either way, with a maturing crypto market, more educated and pragmatic investors and enthusiasts, and crypto that is harder to come by, the focus is now on the technology itself, which is what’s needed in the long term.

Lastly, for your education – and entertainment! – have a read of this list of 100 cryptocurrencies described in 4 words or less.

How GANs can turn AI into a massive force


Deep learning models can already achieve state-of-the-art results in some applications, but their capabilities are still limited. Unlike humans, deep learning models cannot handle even minor changes to their inputs, and hence can only be applied to specific and narrowly defined tasks.

Consider this conversation between two AI agents developed at Facebook, running what might be the most sophisticated negotiation software on the planet:

Bob: “I can can I I everything else.”

Alice: “Balls have zero to me to me to me to me to me to me to me to me to.”

At first, they were speaking in plain old English, but the researchers realized they had forgotten to include a reward for sticking to the language. So the AI agents began to diverge, eventually rearranging legible words into seemingly nonsensical (but, from their perspective, highly efficient) sentences. They invented their own codewords, abbreviations, and structures.

This phenomenon is observed again and again and again.

A vanguard AI technology that can learn, recognize, and generate information on a nearly human level doesn’t exist yet, but we have taken steps in that direction.

What are generative adversarial networks (GANs)?

Generally intelligent systems must be able to generalize from limited data and learn causal relationships. In 2014, Ian Goodfellow, now a researcher at Google Brain, introduced generative adversarial networks (GANs) as an alternative unsupervised machine learning method, aiming to address many of the pain points of existing methods.

GANs consist of two deep neural networks: a generator and a discriminator. The generator’s goal is to create data samples that are indistinguishable from real ones. The discriminator’s goal is to identify which data samples are real and which were produced by the generator.

These two networks compete against each other in a zero-sum game (i.e. one’s loss is the other’s win). Both networks thus become stronger in a relatively short period of time.

[Diagram: GAN generator–discriminator training loop]

Backpropagation is used to update the model parameters and train the neural networks. Over time, the networks learn many features of the provided data. To create realistic forged samples, the generator needs to learn the data’s features and patterns, while the discriminator does the same to correctly distinguish between real and fake samples.

GANs are thus able to overcome the above weaknesses by training (i.e. playing) neural networks against each other: the networks learn from each other (which requires less data) and eventually perform better across a broader range of problems.
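To make the training loop concrete, here is a minimal, hypothetical sketch (not the architecture from Goodfellow’s paper) using PyTorch, which is an assumed dependency: the generator learns to imitate samples from a simple one-dimensional Gaussian, and both networks are updated by backpropagation exactly as described above.

```python
# Minimal GAN sketch (illustrative only): the generator learns to imitate
# samples drawn from a 1-D Gaussian with mean 2.0. Assumes PyTorch is installed.
import torch
import torch.nn as nn

real_data = lambda n: torch.randn(n, 1) * 0.5 + 2.0   # "real" samples
noise = lambda n: torch.randn(n, 8)                    # latent input for the generator

G = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 1))
D = nn.Sequential(nn.Linear(1, 16), nn.ReLU(), nn.Linear(16, 1), nn.Sigmoid())
opt_G = torch.optim.Adam(G.parameters(), lr=1e-3)
opt_D = torch.optim.Adam(D.parameters(), lr=1e-3)
bce = nn.BCELoss()

for step in range(2000):
    # Discriminator step: push real samples toward label 1, fakes toward 0.
    x_real, x_fake = real_data(64), G(noise(64)).detach()
    loss_D = bce(D(x_real), torch.ones(64, 1)) + bce(D(x_fake), torch.zeros(64, 1))
    opt_D.zero_grad(); loss_D.backward(); opt_D.step()

    # Generator step: try to make the discriminator call fakes "real".
    loss_G = bce(D(G(noise(64))), torch.ones(64, 1))
    opt_G.zero_grad(); loss_G.backward(); opt_G.step()

print(G(noise(1000)).mean().item())  # drifts toward 2.0 as the generator improves
```

In a real image GAN the two networks would be deep convolutional models and the data would be images, but the adversarial structure of the loop is the same.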

Applications of GANs

There are several types of GANs, and some of their most obvious applications include high-resolution or interactive image generation/blending, image inpainting, image-to-image translation, abstract reasoning, semantic segmentation, video generation, and text-to-image synthesis, among others.

The video game industry is the first area of entertainment to start seriously experimenting with using AI to generate raw content. There’s a huge cost incentive to invest in video game development automation given the US$300 million+ budgets of modern AAA video games.

GANs have also been used for text, with less success – a bot developed to speak like Friedrich Nietzsche started to speak in a manner similar to the philosopher, but the sentences did not make sense. GANs for voice applications can render a given text string in life-like voices from approximately 20 minutes of voice samples, such as these popular impersonations of American presidents Donald Trump and Barack Obama. In the near future, videos will likely be generated just by providing a script.

Goodfellow and his colleagues used GANs for image generation, recognition, and classification by teaching one of the networks to create images of handwritten digits that humans could not tell apart from real ones. They also trained a neural network to create images of objects, which humans could only distinguish from real ones 78.7 percent of the time. Below are some sample images of faces created entirely by deep convolutional GANs.
[Sample faces generated by a deep convolutional GAN]

Despite all the above achievements, GANs still have weaknesses:

  • Instability (the generator and discriminator losses keep oscillating) and non-convergence of the objective function to an optimum
  • Mode collapse (this happens when the generator doesn’t produce diverse images or information)
  • The possibility that either the generator or the discriminator becomes too strong compared to the other during training
  • The possibility that either the generator or the discriminator never learns beyond a certain point

An existential threat

Do GANs and AI in general pose an existential threat to humanity? Elon Musk thinks so. Since 2014, he has been advocating the adoption of AI regulations by authorities around the world. Recently, he reiterated the urgent need for proactive regulation.

“AI is a fundamental risk to the existence of human civilization,” Musk recently told US politicians.

His concerns stem from the rapid developments related to GANs, which might push humanity toward the inception of artificial general intelligence. While AI regulations may serve as safeguards, AI is still far from the fictitious depictions seen frequently in Hollywood sci-fi movies.

(By the way, Facebook ultimately opted to require its negotiation bots to speak in plain old English.)


This article originally appeared on Tech in Asia.

How AI systems learn: approaches and concepts

As you know, the goal of AI learning is generalisation, but one major issue is that data alone will never be enough, no matter how much of it is available. AI systems need data, and they need to learn from that data, in order to generalise.

So let’s look at how AI systems learn. But before we do that, what are the main prevalent AI approaches?

Neural networks model a brain learning by example―given a set of right answers, a neural network learns the general patterns. Reinforcement Learning models a brain learning by experience―given some set of actions and an eventual reward or punishment, it learns which actions are ‘good’ or ‘bad,’ as relevant in context. Genetic Algorithms model evolution by natural selection―given some set of agents, let the better ones live and the worse ones die.

Usually, genetic algorithms do not allow agents to learn during their lifetimes, while neural networks allow agents to learn only during their lifetimes. Reinforcement learning allows agents to learn during their lifetimes and share knowledge with other agents.
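As a toy illustration of the natural-selection idea (a hypothetical sketch, not any production system), the genetic algorithm below evolves bit-strings toward all ones: the worse half of the population “dies” each generation, and survivors recombine and mutate.

```python
# Toy genetic algorithm: fitness is the number of ones in a bit-string.
import random

LENGTH, POP, GENERATIONS = 20, 30, 40
fitness = lambda genome: sum(genome)

population = [[random.randint(0, 1) for _ in range(LENGTH)] for _ in range(POP)]
for gen in range(GENERATIONS):
    population.sort(key=fitness, reverse=True)
    survivors = population[:POP // 2]                 # selection: the worse half dies
    children = []
    while len(survivors) + len(children) < POP:
        a, b = random.sample(survivors, 2)
        cut = random.randrange(1, LENGTH)
        child = a[:cut] + b[cut:]                     # crossover
        if random.random() < 0.1:                     # occasional mutation
            child[random.randrange(LENGTH)] ^= 1
        children.append(child)
    population = survivors + children

print(fitness(max(population, key=fitness)))          # approaches 20 over the generations
```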

Consider learning a Boolean function of (say) 100 variables from a million examples. There are 2^100 − 10^6 examples whose classes you don’t know. How do you figure out what those classes are? In the absence of further information, there is no way to do this that beats flipping a coin. This observation was first made (in somewhat different form) by David Hume over 200 years ago, but even today many mistakes in ML stem from failing to appreciate it. Every learner must embody some knowledge/assumptions beyond the data it’s given in order to generalise beyond it.

This seems like rather depressing news. How then can we ever hope to learn anything? Luckily, the functions we want to learn in the real world are not drawn uniformly from the set of all mathematically possible functions. In fact, very general assumptions—like similar examples having similar classes, limited dependences, or limited complexity—are often enough to do quite well, and this is a large part of why ML has been so successful to date.

AI systems use induction, deduction, abduction and other methodologies to collect, analyse and learn from data, allowing generalisation to happen.

Like deduction, induction (what learners do) is a knowledge lever: it turns a small amount of input knowledge into a large amount of output knowledge. Induction (despite its limitations) is a more powerful lever than deduction, requiring much less input knowledge to produce useful results, but it still needs more than zero input knowledge to work.

Abduction is sometimes used to identify faults and revise knowledge based on empirical data. For each individual positive example that is not derivable from the current theory, abduction is applied to determine a set of assumptions that would allow it to be proven. These assumptions can then be used to make suggestions for modifying the theory. One potential repair is to learn a new rule for the assumed proposition so that it could be inferred from other known facts about the example. Another potential repair is to remove the assumed proposition from the list of antecedents of the rule in which it appears in the abductive explanation of the example – parsimonious covering theory (PCT). Abductive reasoning is useful in inductively revising existing knowledge bases to improve their accuracy. Inductive learning can be used to acquire accurate abductive theories.

One key concept in AI is the classifier. Generally, AI systems can be divided into two types: classifiers (“if shiny and yellow then gold”) and controllers (“if shiny and yellow then pick up”). Controllers, however, also classify conditions before inferring actions. Classifiers are functions that use pattern matching to determine the closest match. They can be tuned according to examples, known as observations or patterns. In supervised learning, each pattern belongs to a certain predefined class. A class can be seen as a decision that has to be made. All the observations combined with their class labels are known as a data set. When a new observation is made, it is classified based on previous experience.
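As a toy illustration of “classified based on previous experience,” here is a one-nearest-neighbour classifier in Python; the observations, features and labels are made up for the example.

```python
# Nearest-neighbour classification: a new observation gets the label of the
# closest previously seen observation (pattern matching against experience).
observations = [((0.9, 0.8), "gold"), ((0.2, 0.9), "brass"),
                ((0.8, 0.1), "silver"), ((0.1, 0.2), "rock")]  # (shininess, yellowness) -> class

def classify(x):
    squared_distance = lambda a, b: sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return min(observations, key=lambda obs: squared_distance(obs[0], x))[1]

print(classify((0.85, 0.75)))   # -> "gold": shiny and yellow, like the closest example
```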

Classifier performance depends greatly on the characteristics of the data to be classified. Many of the most widely used classifiers are trained (i.e. learn) using kernel methods. There is no single classifier that works best on all given problems – “no free lunch”. Determining an optimal classifier for a given problem is still more an art than a science.

The following formula sums up the process of AI learning.

LEARNING = REPRESENTATION + EVALUATION + OPTIMISATION

Representation. A classifier must be represented in some formal language that the computer can handle. Conversely, choosing a representation for a learner is tantamount to choosing the set of classifiers that it can possibly learn. This set is called the hypothesis space of the learner. If a classifier is not in the hypothesis space, it cannot be learned. A related question is how to represent the input, i.e., what features to use.

Evaluation. An evaluation function is needed to distinguish good classifiers from bad ones. The evaluation function used internally by the algorithm may differ from the external one that we want the classifier to optimise, for ease of optimisation (see below) and due to the issues discussed in the next section.

Optimisation. We need a method to search among the classifiers in the language for the highest-scoring one. The choice of optimisation technique is key to the efficiency of the learner, and also helps determine the classifier produced if the evaluation function has more than one optimum. It is common for new learners to start out using off-the-shelf optimisers.
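Here is a deliberately tiny sketch of those three components working together, with made-up data: the representation is a space of decision stumps, the evaluation function is accuracy on the data, and the optimisation is plain exhaustive search.

```python
# LEARNING = REPRESENTATION + EVALUATION + OPTIMISATION, in miniature.
import itertools

data = [((1.0, 3.2), 1), ((0.4, 2.9), 0), ((1.5, 0.8), 1),
        ((0.2, 1.1), 0), ((1.2, 2.5), 1), ((0.3, 0.5), 0)]     # (features, label)

# REPRESENTATION: the hypothesis space is all decision stumps
# "predict 1 if feature[i] > threshold, else 0".
thresholds = [0.1 * t for t in range(40)]
hypothesis_space = list(itertools.product([0, 1], thresholds))

# EVALUATION: score each candidate classifier by its accuracy.
def accuracy(stump):
    i, threshold = stump
    return sum((x[i] > threshold) == bool(y) for x, y in data) / len(data)

# OPTIMISATION: here, simply search the whole (tiny) hypothesis space.
best = max(hypothesis_space, key=accuracy)
print(best, accuracy(best))   # a stump on feature 0 separates this toy data perfectly
```

Real learners differ mainly in how rich the hypothesis space is and how cleverly they search it, but the three ingredients are always there.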

The key criterion for choosing a representation is which kinds of knowledge are easily expressed in it. For example, if we have knowledge about probabilistic dependencies, graphical models are a good fit. And if we have knowledge about what kinds of preconditions are required by each class, “IF . . . THEN . . .” rules may be the best option. The most useful learners in this regard are those that don’t just have assumptions hard-wired into them, but allow us to state them explicitly, vary them widely, and incorporate them dynamically into the learning.

What if the knowledge and data we have are not sufficient to completely determine the correct classifier? Then we run the risk of just inventing a classifier (or parts of it) that is not grounded in reality, and is simply encoding random quirks in the data. This problem is called overfitting, and is the bugbear of ML. When a learner outputs a classifier that is 100% accurate on the training data but only 50% accurate on real data, when in fact it could have output one that is 75% accurate on both, it has overfit.

One way to understand overfitting is by decomposing generalisation error into bias and variance. Bias is a learner’s tendency to consistently learn the same wrong thing. Variance is the tendency to learn random things irrespective of the real signal. Cross-validation can help to combat overfitting, but it’s no panacea, since if we use it to make too many parameter choices it can itself start to overfit. Besides cross-validation, there are many methods to combat overfitting; the most popular is adding a regularisation term to the evaluation function. Another option is to perform a statistical significance test like chi-square before adding new structure, to decide whether the distribution of the class really is different with and without this structure.
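A quick, hypothetical way to see that gap in practice, assuming scikit-learn is available: an unconstrained decision tree memorises noisy labels and looks perfect on its training data, while cross-validation reveals how little of that accuracy generalises.

```python
# Overfitting demo: training accuracy vs. cross-validated accuracy.
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=200, n_features=20, n_informative=3,
                           flip_y=0.2, random_state=0)   # deliberately noisy labels

for depth in (None, 3):   # None = grow until pure (high variance); 3 = regularised (more bias)
    tree = DecisionTreeClassifier(max_depth=depth, random_state=0)
    train_acc = tree.fit(X, y).score(X, y)
    cv_acc = cross_val_score(tree, X, y, cv=5).mean()
    print(f"max_depth={depth}: train accuracy {train_acc:.2f}, cross-validated {cv_acc:.2f}")
```

Limiting the tree depth plays the role of the regularisation term mentioned above: it trades a little bias for a large reduction in variance.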

 


Limits of deep learning and the way ahead

Artificial intelligence has reached peak hype. News outlets report that companies have replaced workers with IBM Watson and algorithms are beating doctors at diagnoses. New AI startups pop up every day – especially in China – and claim to solve all your personal and business problems with machine learning.

Ordinary objects like juicers and wifi routers suddenly advertise themselves as “powered by AI”. Not only can smart standing desks remember your height settings, they can also order you lunch.

Much of the AI hubbub is generated by reporters with little or superficial knowledge of the subject matter and by startups hoping to be acqui-hired for engineering talent despite not solving any real business problems. No wonder there are so many misconceptions about what AI can and cannot do.

Deep learning will shape the future ahead

Neural networks were invented in the 60s, but recent boosts in big data and computational power made them actually useful. The results are undeniably incredible. Computers can now recognize objects in images and video and transcribe speech to text better than humans can. Google replaced Google Translate’s architecture with neural networks and now machine translation is also closing in on human performance.

The practical applications are mind-blowing. Computers can predict crop yield better than the USDA and indeed diagnose cancer more accurately than expert physicians.

DARPA, the creator of the Internet and many other modern technologies, sees three waves of AI:

  1. Handcrafted knowledge, or expert systems like IBM’s Deep Blue or IBM Watson;
  2. Statistical learning, which includes machine learning and deep learning;
  3. Contextual adaptation, which involves constructing reliable, explanatory models for real-world phenomena using sparse data, like humans do.

As part of the current second wave of AI, deep learning algorithms work well because of what the report calls the “manifold hypothesis.” This refers to how different types of high-dimensional natural data tend to clump and be shaped differently when visualised in lower dimensions.

[DARPA illustration: high-dimensional data manifolds]

By mathematically manipulating and separating data clumps, deep neural networks can distinguish different data types. While neural networks can achieve nuanced classification and prediction capabilities, they are essentially what has been called “spreadsheets on steroids.”

[DARPA illustration: separating data manifolds]

Deep learning algorithms have deep learning problems

At the recent AI By The Bay conference, François Chollet, creator of the widely used deep learning library Keras, argued that deep learning is simply more powerful pattern recognition than previous statistical and machine learning methods, and that the most important problems for AI today are abstraction and reasoning. Current supervised perception and reinforcement learning algorithms require lots of training data, are terrible at planning, and only do straightforward pattern recognition.

By contrast, humans “learn from very few examples, can do very long-term planning, and are capable of forming abstract models of a situation and manipulate these models to achieve extreme generalisation.”

Even simple human behaviours are laborious to teach to a deep learning algorithm. Let’s examine the task of not being hit by a car as you walk down the road.

Humans only need to be told once to avoid cars. We’re equipped with the ability to generalise from just a few examples and are capable of imagining (i.e. modelling) the dire consequences of being run over. Without losing life or limb, most of us quickly learn to avoid being run over by motor vehicles.

Let’s now see how this works out if we train a computer. If you go the supervised learning route, you need big data sets of car situations with clearly labeled actions to take, such as “stop” or “move”. Then you’d need to train a neural network to learn the mapping between the situation and the appropriate action. If you go the reinforcement learning route, where you give an algorithm a goal and let it independently determine the ideal actions to take, the computer will “die” many times before learning to avoid cars in different situations.

While neural networks achieve statistically impressive results across large sample sizes, they are “individually unreliable” and often make mistakes humans would never make, such as classifying a toothbrush as a baseball bat.

[DARPA example: image misclassification]

Your results are only as good as your data

Neural networks fed inaccurate or incomplete data will simply produce the wrong results. The outcomes can be both embarrassing and damaging. In two major PR debacles, Google Photos incorrectly classified African Americans as gorillas, while Microsoft’s Tay learned to spew racist, misogynistic hate speech after only hours of training on Twitter.

Undesirable biases may even be implicit in our input data. Google’s massive Word2Vec embeddings contain vectors for 3 million words and phrases from Google News. The data set makes associations such as “father is to doctor as mother is to nurse,” which reflect gender bias in our language.

For example, researchers use human ratings gathered on Mechanical Turk to perform “hard de-biasing” that undoes such associations. Such tactics are essential, since word embeddings not only reflect stereotypes but can also amplify them. If the term “doctor” is more associated with men than with women, then an algorithm might prioritise male job applicants over female ones for open physician positions.
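For the curious, here is a rough numpy sketch of the “neutralize” step behind hard de-biasing: project each word vector off an estimated gender direction. The vectors below are toy values invented for illustration, not real Word2Vec embeddings.

```python
# Toy illustration of the "neutralize" projection used in hard de-biasing.
import numpy as np

def neutralize(v, bias_direction):
    g = bias_direction / np.linalg.norm(bias_direction)
    return v - np.dot(v, g) * g          # remove the component along the bias direction

he, she = np.array([0.8, 0.1, 0.3]), np.array([-0.7, 0.2, 0.3])
doctor = np.array([0.3, 0.9, 0.4])       # toy vector with a spurious "male" component
bias_direction = he - she

print(np.dot(doctor, bias_direction))                              # nonzero: biased
print(np.dot(neutralize(doctor, bias_direction), bias_direction))  # ~0 after neutralizing
```

In the published approach, the bias direction is estimated from many gendered word pairs and only gender-neutral words are neutralized; this sketch shows just the core projection.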

Neural networks can be tricked or exploited

Ian Goodfellow, the inventor of GANs, showed that neural networks can be deliberately tricked with adversarial examples. By mathematically manipulating an image in a way that is undetectable to the human eye, sophisticated attackers can trick neural networks into grossly misclassifying objects.

[Illustration: adversarial attack examples]

The dangers such adversarial attacks pose to AI systems are alarming, especially since adversarial images and original images seem identical to us. Self-driving cars could be hijacked with seemingly innocuous signage and secure systems could be compromised by data that initially appears normal.
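To make such an attack concrete, here is a minimal sketch of the fast gradient sign method, one well-known way of crafting adversarial examples (the article doesn’t name a specific method). It assumes PyTorch and uses an untrained stand-in classifier purely for illustration; a real attack would target a trained model.

```python
# Fast gradient sign method (FGSM) sketch: nudge each pixel slightly in the
# direction that increases the classifier's loss.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))  # stand-in, untrained classifier
loss_fn = nn.CrossEntropyLoss()

def fgsm(x, y, eps=0.1):
    x = x.clone().requires_grad_(True)
    loss_fn(model(x), y).backward()
    return (x + eps * x.grad.sign()).clamp(0, 1).detach()    # small, bounded perturbation

x = torch.rand(1, 1, 28, 28)    # placeholder "image"
y = torch.tensor([3])           # placeholder true label
x_adv = fgsm(x, y)
print((x_adv - x).abs().max())  # perturbation never exceeds eps, yet can flip the prediction
```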

Potential solutions

How can we overcome the limitations of deep learning and proceed towards general artificial intelligence? Chollet’s initial plan is using “super-human pattern recognition like deep learning to augment explicit search and formal systems”, starting with the field of mathematical proofs. Automated Theorem Provers (ATPs) typically use brute force search and quickly hit combinatorial explosions in practical use. In the DeepMath project, Chollet and his colleagues used deep learning to assist the proof search process, simulating a mathematician’s intuitions about what lemmas might be relevant.

Another approach is to develop more explainable models. In handwriting recognition, neural nets currently need to be trained on many thousands of examples to perform decent classification. Instead of looking at just pixels, generative models can be taught the strokes behind any given character and use this physical construction information to disambiguate between similar numbers, such as a 9 or a 4.

Yann LeCun, head of AI research at Facebook, proposes “energy-based models” as a method of overcoming limits in deep learning. Typically, a neural network is trained to produce a single output, such as an image label or sentence translation. LeCun’s energy-based models instead give an entire set of possible outputs, such as the many ways a sentence could be translated, along with scores for each configuration.

Geoffrey Hinton, often called the “father of deep learning,” wants to replace the neurons in neural networks with “capsules,” which he believes more accurately reflect the cortical structure of the human mind. Evolution must have found an efficient way to adapt features that are early in a sensory pathway so that they are more helpful to features that are several stages later in the pathway. He thinks capsule-based neural network architectures will be more resistant to adversarial attacks.

Perhaps all of these approaches to overcoming the limits of deep learning have value. Perhaps none of them do. Only time and continued investment in AI will tell. But one thing seems clear: it is unlikely that general intelligence can be achieved simply by scaling up today’s deep learning techniques.

Survival of blockchain and Ethereum vs. alternatives

As outlined in my previous post, blockchain faces a number of fundamental – technological, cultural, and business – issues before it becomes mainstream. However, the potential of blockchain, especially coupled with AI, cannot be ignored. The potent combination of blockchain and AI can revolutionise healthcare, science, government, autonomous driving, financial services, and a number of other key industries.

Discussions continue about blockchain’s ability to lift people out of poverty through mobile transactions, improve accounting for tourism in second-world countries, and make governance transparent with electronic voting. But, just like the complementary – and equally hyped – technologies of AI, IoT, and big data, blockchain technology is emerging and as yet unproven at scale. Additional socio-political and economic roadblocks remain to blockchain’s widespread adoption and application:

1. Disparity of computer power and electricity distribution

Bitcoin transactions on blockchain require “half the energy consumption of Ireland”. This surge of electricity use is simply impossible in developing countries where the resource is scarce and expensive. Even if richer countries assist and invest in poorer ones, the UN is concerned that elite, external ownership of critical infrastructure may lead to a digital form of neo-colonialism.

2. No mainstream trust for blockchain

Bitcoin inspired the explosive attention on blockchain, but there isn’t currently much trust in the technology – as it’s relatively new, unproven and has technical problems and limitations – outside of digital currencies. With the technologies still in their infancy, blockchain companies are slow to deliver on promises. This turtle pace does not satisfy investors seeking quick ROI. Perhaps the largest challenge to blockchain adoption is the massive transformation in architectural, regulatory, and business management practices required to deploy the technology at scale. Even if such large-scale changes are pulled off, society may experience a culture shock from switching to decentralised, automated systems after a history of only centralised ones.

3. Misleading and misguided ‘investments’

Like the Internet, blockchain technology is most powerful when everyone is on the same network. The Internet grew in fits and starts, but was ultimately driven by the killer app of email. While Bitcoin and digital currencies are the “killer app” of blockchain, we’ve already seen aggressive investments in derivative cryptocurrencies peter out.

Many technologies also call themselves “blockchain” to capitalise on hype and capture investment, but are not actual blockchain implementations. But, even legitimate blockchain technologies suffer from the challenge of timing, often launching in a premature ecosystem unable to support adoption and growth.

4. Cybersecurity risks and flaws

The operational risks of cybersecurity threats to blockchain technology make early adopters hesitate to engage. Additionally, bugs in the technology are challenging to detect, yet can cause outsized damage. Getting the code right is critical, but this requires time and talent.

While the better-known PoW-based blockchain systems such as Bitcoin and Ethereum get the limelight and PR, there are a number of alternative blockchain protocols and approaches that are scalable and solve many of the fundamental challenges the incumbents face.

PoW and Ethereum alternatives

Disclaimer: I neither endorse, engage with nor promote any of the alternatives below; I simply provide information as found on the websites, articles and social media of the relevant entities, and am therefore not responsible for whether that information is accurate or realistic.

1. BitShares, Steemit (based on Steem) and EOS (per its white paper), which are all based on Delegated Proof of Stake (DPOS). DPOS enables BitShares to process 180k transactions per second, more than 5x NASDAQ’s rate. Steem and BitShares process more transactions per day than the top 20 blockchains combined.

In DPOS, a new block is created every 2 seconds – Bitcoin’s PoW generates a new block every 10 minutes – by witnesses (stakeholders can elect any number of witnesses to generate blocks – currently 21 in Steem and 25 in BitShares). DPOS uses pipelining to increase scalability: the witnesses generate their blocks in a specified order that holds for a few rounds (hence the pipelining), after which the order is changed. DPOS confirms transactions with 99.9% certainty in an average of just 1.5 seconds while degrading in a graceful, detectable manner that is trivial to recover from. It is easy to increase the scalability of this scheme by introducing additional witnesses, either by increasing the pipeline length or by using sharding to generate several blocks during the same epoch in a deterministic, verifiable way.

2. IOTA (originally designed to be a financial system for IoT) is a new blockless distributed ledger which is scalable, lightweight and fee-less. It’s based on a DAG, and its performance increases the bigger the network gets.

3. Ardor solves the bloat problem common to all blockchains, relying on an innovative parent/child chain architecture and pruning of child-chain transactions. It shares some similarities with plasma.io, is based on NXT blockchain technology and is already running on testnet.

4. LTCP uses State Channels by stripping 90% of the transaction data from the blockchain. LTCP combined with RSK’s Lumino network or Ethereum’s Raiden network can serve 1 billion users in both retail and online payments.

5. Stellar runs on the Stellar Consensus Protocol (SCP) and is scalable, robust, has a distributed exchange and is easy to use. SCP implements “Federated Byzantine Agreement,” a new approach to achieving consensus in a real-world network that includes faulty “Byzantine” nodes with technical errors or malicious intent. To tolerate Byzantine failures, SCP is designed not to require unanimous consent from the complete set of nodes for the system to reach agreement, and to tolerate nodes that lie or send incorrect messages. In SCP, individual nodes decide which other participants they trust for information, and partially validate transactions based on individual “quorum slices.” The system-wide quorums for valid transactions result from the individual quorum decisions made by individual nodes.

6. A thin client is a program which connects to the Bitcoin network but doesn’t fully validate transactions or blocks, i.e. it’s a client to the full nodes on the network. Most thin clients use the Simplified Payment Verification (SPV) method to verify that confirmed transactions are part of a block. To do this, they connect to a full node on the blockchain network and send it a Bloom filter that will match any transactions affecting the client’s wallet. When a new block is created, the client requests a special lightweight version of that block, the Merkle block, which includes a block header, a relatively small number of hashes, a list of one-bit flags, and a transaction count. Using this information (often less than 1 KB of data), the client can build a partial Merkle tree up to the block header. If the hash of the root node of the partial Merkle tree equals the Merkle root in the block header, the SPV client has cryptographic proof that the transaction was included in that block. If that block then gets 6 confirmations at the current network difficulty, the client has extremely strong proof that the transaction was valid and is accepted by the entire network. (A simplified sketch of this Merkle-proof verification appears after this list.)

The only major downside of the SPV method is that full nodes can simply withhold transactions from thin clients, making it look as if the client hasn’t received bitcoins or as if a transaction the client broadcast earlier hasn’t confirmed.

7. Mimir proposes a network of Proof of Authority micro-channels for use in generating a trustless, auditable, and secure bridge between Ethereum and the Internet. This system aims to establish Proof of Authority for individual validators via a Proof-of-Stake contract registry located on Ethereum itself. This Proof-of-Stake contract takes stake in the form of Mimir B2i Tokens. These tokens serve as collateral that may be repossessed in the event of malicious actions. In exchange for serving requests against the Ethereum blockchain, validators get paid in Ether.

8. Ripple’s XRP ledger already handles 1,500 transactions per second on-chain, a figure that keeps improving (it was 1,000 transactions per second at the beginning of 2017).

9. QTUM, a hybrid blockchain platform whose technology combines a fork of Bitcoin Core, an Account Abstraction Layer allowing for multiple virtual machines including the Ethereum Virtual Machine (EVM), and Proof-of-Stake consensus, aimed at tackling industry use cases.

10. Blocko, which has enterprise- and consumer-grade layers and has already successfully piloted/launched products (dApps) with/for Korea Exchange, LotteCard and Hyundai.

11. Algorand uses “cryptographic sortition” to select players to create and verify blocks. It scales on demand and is more secure and faster than traditional PoW and PoS systems. While most PoS systems rely on some type of randomness, Algorand is different in that you self-select by running the lottery on your own computer (not in the cloud or on a public chain). The lottery is based on information in the previous block, while the selection is automatic (involving no message exchange) and completely random. Thanks to David Deputy for pointing out this platform!

12. NEO, also called the “Ethereum of China,” is a non-profit, community-based blockchain project that utilizes blockchain technology and digital identity to digitize assets, to automate the management of digital assets using smart contracts, and to realize a “smart economy” with a distributed network.
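To make the SPV verification described in item 6 concrete, here is a simplified Python sketch of Merkle-proof checking. It is not byte-exact with Bitcoin’s serialisation (which hashes little-endian encodings of transaction IDs), but the double-SHA-256 tree and the proof logic are the same in spirit.

```python
# Simplified Merkle tree and proof verification (illustrative, not consensus-exact).
import hashlib

def h(data: bytes) -> bytes:
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()  # double SHA-256

def merkle_root(leaves):
    level = [h(x) for x in leaves]
    while len(level) > 1:
        if len(level) % 2:                       # duplicate the last hash on odd levels
            level.append(level[-1])
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]

def verify_proof(leaf, proof, root):
    """proof: list of (sibling_hash, sibling_is_right) pairs from the leaf up to the root."""
    node = h(leaf)
    for sibling, sibling_is_right in proof:
        node = h(node + sibling) if sibling_is_right else h(sibling + node)
    return node == root

txs = [b"tx0", b"tx1", b"tx2", b"tx3"]
root = merkle_root(txs)
# Proof that tx2 is in the block: its sibling tx3, then the hash of the (tx0, tx1) pair.
proof = [(h(b"tx3"), True), (h(h(b"tx0") + h(b"tx1")), False)]
print(verify_proof(b"tx2", proof, root))   # True: a few hashes instead of the whole block
```

An SPV client does essentially this with the hashes contained in a Merkle block, and then checks that the block header itself is buried under enough proof-of-work.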

Blockchain + AI = ?

What happens when two major technological trends see a synergy or overlap in usage or co-development?

We have blockchain’s promise of near-frictionless value exchange and AI’s ability to conduct analysis of massive amounts of data. The joining of the two could mark the beginning of an entirely new paradigm. We can maximize security while remaining immutable by employing AI agents that govern the chain. With more companies and institutions adopting blockchain-based solutions, and more complex, potentially critical data stored in distributed ledgers, there’s a growing need for sophisticated analysis methods, which AI technology can provide.

The combination of AI and blockchain is fueling the onset of the “Fourth Industrial Revolution” by reinventing economics and information exchange.

1. Precision medicine

Google DeepMind is developing an “auditing system for healthcare data”. Blockchain will enable the system to remain secure and shareable, while AI will allow medical staff to obtain analytics on medical predictions drawn from patient profiles.

2. Wealth and investment management

State Street is issuing blockchain-based indices. Data is stored and made secure using blockchain and analyzed using AI. It reports that 64% of wealth and asset managers polled expected their firms to adopt blockchain in the next five years. Further, 49% of firms said they expect to employ AI. As of January 2017, State Street had 10 blockchain POCs in the works.

3. Smart urbanity

On the energy supply side, distributed blockchain technology enables transparent and cost-effective transactions between producers and consumers, while machine learning algorithms can home in on transactions to estimate pricing. Green-friendly AI and blockchain help reduce energy waste and optimize energy trade. For example, an AI system governing a building can oversee energy use by taking into account factors like the presence and number of residents, the season, and traffic information.

4. Legal diamonds

Everledger, built on IBM’s blockchain and cognitive computing services, uses blockchain technology to tackle fraud in the diamond industry, deploying cognitive analytics to heavily “cross-check” regulations, records, supply-chain, and IoT data in the blockchain environment.

5. More efficient science

The “file-drawer problem” in academia is when researchers don’t publish “non-result” experiments. Duplicate experiments and a lack of knowledge follow, trampling scientific discourse. To resolve this, experimental data can be stored in a publicly accessible blockchain. Data analytics could also help identify things like how many times the same experiment has been run or what the probable outcome of a certain experiment is.

There are forecasts that AI will play a big role in science once “smart contracts” transacted by blockchain require smarter “nodes” that function in a semi-autonomous way. Smart contracts (essentially, pieces of software) simulate, enforce and manage contractual agreements and can have wide-ranging applications when academics embrace the blockchain for knowledge transfer and development.

6. IP rights management

Digitalization has introduced complicated digital rights into IP management, and when AI learns the rules of the game, it can identify actors who break IP laws. As for IP contract management, in the music (and other content) industries blockchain enables immediate payment to artists and authors. One artist recently suggested the blockchain could help musicians simplify creative collaboration and make money. Ujo Music is using the Ethereum blockchain platform for song distribution.

7. Computational finance

Smart contracts could take center stage where transparent information is crucial for trust in financial services. Financial transactions may no longer rely on a human “clearing agent” as they become automated, performing better and faster. But since confidence in transactions remains dependent on people, AI can help monitor human emotions and predict the optimal trading environment. Thus, “algotrading” can be powered by algorithms that trade based on investment patterns correlated with emotions.

8. Data and IoT management

Organizations are increasingly looking to adopt blockchain technologies for alternative data storage. And with heaps of data distributed across blockchain ledgers, the need for data analytics with AI is growing. IBM Watson merged blockchain with AI via the Watson IoT group. In this, an artificially intelligent blockchain lets joint parties collectively agree on the state of the device and make decisions on what to do based on language coded into a smart contract. Using blockchain tech, artificially intelligent software solutions are implemented autonomously. Risk management and self-diagnosis are other use cases being explored.

9. Blockchain-As-A-Service software

Microsoft is integrating “BaaS modules” (based on the public Ethereum network) into Azure, which users can use to create test environments. Blockchains are cheaper to create and test there, and in Azure they come with reusable templates and artifacts.

10. Governance 3.0

Blockchain and AI could contribute to the development of direct democracy. They can transfer large volumes of data globally, tracing e-voting procedures and displaying them publicly so that citizens can engage in real time. Democracy Earth Foundation aspires to “hack democracy” by advocating open-source software, peer-to-peer networks, and smart contracts. The organization also aims to fight fake identities and reclaim individual accountability in the political sphere. IPDB is a planetary-scale blockchain database built on BigchainDB. It’s a ready-to-use public network with a focus on strong governance.

Is self-play the future of (most) AI?

Go is a game whose number of possible board configurations – around 10^170, far more than chess – is greater than the number of atoms in the universe.

AlphaGo, the predecessor to AlphaGo Zero, crushed 18-time world champion Lee Sedol and the reigning world number one player, Ke Jie. After beating Jie earlier this year, DeepMind announced AlphaGo was retiring from future competitions.

Now an even stronger competitor, AlphaGo Zero, could beat the version of AlphaGo that faced Lee Sedol after training for just 36 hours, and beat its predecessor by a 100-0 score after 72 hours. Interestingly, AlphaGo Zero didn’t learn from observing humans playing against each other – unlike AlphaGo – but instead its neural network relies on an old technique in reinforcement learning: self-play. Self-play means agents can learn behaviours that are not hand-coded, on any reinforcement learning task, but the sophistication of the learned behaviour is limited by the sophistication of the environment. In order for an agent to learn intelligent behaviour in a particular environment, the environment has to be challenging, but not too challenging.

Essentially, self-play means that AlphaGo Zero plays against itself. During training, it sits on each side of the table: two instances of the same software face off against each other. A match starts with the game’s black and white stones scattered on the board, placed following a random set of moves from their starting positions. The two computer players are given the list of moves that led to the positions of the stones, and then are each told to come up with multiple chains of next moves along with estimates of the probability they will win by following through each chain. The next move from the best possible chain is then played, and the computer players repeat the above steps, coming up with chains of moves ranked by strength. This repeats over and over, with the software feeling its way through the game and internalizing which strategies turn out to be the strongest.

AlphaGo Zero did start from scratch with no experts guiding it. And it is much more efficient: it uses only a single computer and four of Google’s custom TPU chips to play matches, compared to AlphaGo’s several machines and 48 TPUs. Since Zero didn’t rely on human gameplay, and used a smaller number of matches, its Monte Carlo tree search is smaller. The self-play algorithm also combined the value and policy neural networks into one, and was trained on 64 GPUs and 19 CPUs by playing nearly five million games against itself. In comparison, AlphaGo needed months of training and used 1,920 CPUs and 280 GPUs to beat Lee Sedol.

AlphaGo combines the two most powerful ideas about learning to emerge from the past few decades: deep learning and reinforcement learning. In the human brain, sensory information is processed in a series of layers. For instance, visual information is first transformed in the retina, then in the midbrain, and then through many different areas of the cerebral cortex. This creates a hierarchy of representations where simple, local features are extracted first, and then more complex, global features are built from these. The AI equivalent is called deep learning; deep because it involves many layers of processing in simple neuron-like computing units.

But to survive in the world, animals need to not only recognise sensory information, but also act on it. Generations of scientists have studied how animals learn to take a series of actions that maximise their reward. This has led to mathematical theories of reinforcement learning that can now be implemented in AI systems. The most powerful of these is temporal difference learning, which improves actions by maximising the expectation of future reward. It is thus that AlphaGo Zero, among other things, even discovered for itself, without human intervention, classic Go moves such as fuseki opening tactics and life-and-death patterns.
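To show what a temporal difference update looks like in practice, here is a minimal, hypothetical TD(0) example in Python – a random walk on a five-state corridor, nothing to do with AlphaGo’s actual implementation. Each step nudges a state’s value toward the reward plus the discounted value of the next state.

```python
# TD(0) value learning on a tiny corridor: states 0..4, start at 2,
# reward +1 only for reaching state 4; states 0 and 4 are terminal.
import random

V = [0.0] * 5
alpha, gamma = 0.1, 0.9

for episode in range(5000):
    s = 2
    while s not in (0, 4):
        s_next = s + random.choice([-1, 1])                    # random policy
        r = 1.0 if s_next == 4 else 0.0
        target = r if s_next in (0, 4) else r + gamma * V[s_next]
        V[s] += alpha * (target - V[s])                        # TD(0) update
        s = s_next

print([round(v, 2) for v in V])   # values increase toward the rewarding end of the corridor
```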

So are there problems to which the current algorithms can be fairly immediately applied?

One example may be optimisation in controlled industrial settings. Here the goal is often to complete a complex series of tasks while satisfying multiple constraints and minimising cost. As long as the possibilities can be accurately simulated, self-play-based algorithms can explore and learn from a vastly larger space of outcomes than will ever be possible for humans.

Researchers at OpenAI have already experimented with the same technique to train bots to play Dota 2, and published a paper on competitive self-play. There are other experiments, such as this one, showing how self-play/self-teaching AI is better at predicting heart attacks.

AlphaGo Zero’s success bodes well for AI’s mastery of games. But it would be a mistake to believe that we’ve learned something general about thinking and about learning for general intelligence. This approach won’t work in more ill-structured problems like natural-language understanding or robotics, where the state space is more complex and there isn’t a clear objective function.

Unsupervised training is the key to ultimately creating AI that can think for itself, but more research is needed outside of the confines of board games and predefined objective functions before computers can really begin to think outside the box.

DeepMind says the research team behind AlphaGo is looking to pursue other complex problems, such as finding new cures for diseases, quantum chemistry and material design.

Although it couldn’t sample every possible board position, AlphaGo’s neural networks extracted key ideas about strategies that work well in any position. Unfortunately, as yet there is no known way to interrogate the network to read out directly what these key ideas are. If the game of Go is learned purely through supervised learning, the best one could hope for is to play as well as the human being imitated. Through self-play (and thus unsupervised learning), one could learn something completely novel and create or catalyse emergence.

DeepMind’s self-play approach is not the only way to push the boundaries of AI. Gary Marcus, a neuroscientist at NYU, co-founded Geometric Intelligence (acquired by Uber) to explore learning techniques that extrapolate from a small number of examples, inspired by how children learn. He has claimed his techniques outperform both Google’s and Microsoft’s deep-learning algorithms.