Here’s a function: f(x). It’s expensive to calculate, not necessarily an analytic expression, and you don’t know its derivative.
Your task: find the global minimum.
This is a genuinely difficult task, harder than many other optimization problems within machine learning. Gradient descent, for one, assumes access to a function's derivatives and to cheap, repeated evaluation of the expression being minimized.
Alternatively, in some optimization scenarios the function is cheap to evaluate: if we can get hundreds of results for variants of an input x in a few seconds, a simple grid search can be employed with good results.
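When the function really is cheap, the grid-search idea is only a few lines. A minimal sketch, using a made-up stand-in for f(x) (any expensive black-box evaluation would slot in the same way):

```python
import numpy as np

# Hypothetical stand-in for the black-box function; in practice f(x)
# might be a simulation or a model-training run, not a formula.
def f(x):
    return np.sin(3 * x) + 0.5 * x ** 2

# Evaluate f on a uniform grid and keep the best point seen.
grid = np.linspace(-3, 3, 601)
values = [f(x) for x in grid]
best_x = grid[int(np.argmin(values))]
best_val = min(values)
```

The catch, of course, is that the number of grid points explodes exponentially with the dimension of x, which is why this only works when evaluations are nearly free.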
With the advent of convolutional neural networks and transformers to handle complex image recognition and natural language processing tasks, deep learning models have skyrocketed in size.
Although the increase in size is usually associated with an increase in predictive power, this supersizing comes with undesirable costs.
Generative Adversarial Networks (GANs) are an important development in deep learning; their formulation inspired a new generation of model ensembles that interact with one another to produce incredible results.
Competition is in the blood of GAN design — “adversarial” is in its name. Two models, the discriminator and the generator, compete against each other. The ideal result: the generator becomes great at generating convincing images because it was competing against a worthy opponent (the discriminator).
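The competition is captured by the GAN value function V(D, G) = E[log D(x)] + E[log(1 − D(G(z)))], which the discriminator maximizes and the generator minimizes. A toy sketch of evaluating it, with made-up stand-ins (a logistic score for D, a simple shift for G) rather than trained networks:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "discriminator": a logistic score giving P(x is real).
def discriminator(x, w=2.0, b=0.0):
    return 1.0 / (1.0 + np.exp(-(w * x + b)))

# Toy "generator": shifts noise toward a (wrong) fake distribution.
def generator(z, shift=3.0):
    return z + shift

real = rng.normal(0.0, 1.0, size=10_000)   # samples from the "data"
fake = generator(rng.normal(0.0, 1.0, size=10_000))

# V(D, G): the discriminator pushes this up, the generator pushes it down.
V = (np.mean(np.log(discriminator(real)))
     + np.mean(np.log(1.0 - discriminator(fake))))
```

Training alternates gradient steps on the two players against this single objective — which is exactly the competitive framing the next paragraph questions.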
However, interestingly, it is possible to reframe the GAN without competition. This reframing, in many ways, is advantageous over competitive GAN designs.
Machine learning has often been described as the study of “algorithms that create algorithms”. To a certain extent, this is true — machine learning finds the best model to fit the data. It is the process by which we attain the algorithm that can give predictions on the data.
Yet, there is something slightly off about this description: it erases the task-specific involvement of humans. In reality, humans play a heavy role in ensuring that machine learning algorithms work with the given data. …
Recent research is increasingly investigating how neural networks, as over-parameterized as they are, generalize. According to traditional statistics, the more parameters a model has, the more it overfits. This notion is directly contradicted by a fundamental axiom of deep learning:
Increased parametrization improves generalization.
Although it may not be explicitly stated anywhere, it’s the intuition behind why researchers continue to push models larger to make them more powerful.
There have been many efforts to explain exactly why this is so. Most are quite interesting; the recently proposed Lottery Ticket Hypothesis states that neural networks are just giant lotteries finding…
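The Lottery Ticket Hypothesis is usually demonstrated via iterative magnitude pruning: zero out the smallest-magnitude weights and keep a binary mask (the “winning ticket”) that would be retrained from the original initialization. A minimal sketch of one pruning round, on a made-up weight matrix:

```python
import numpy as np

rng = np.random.default_rng(42)

# Stand-in for one trained layer's weight matrix.
weights = rng.normal(size=(4, 4))

def magnitude_prune(w, p=0.5):
    """Zero the fraction p of smallest-magnitude weights; return (pruned, mask)."""
    threshold = np.quantile(np.abs(w), p)
    mask = (np.abs(w) >= threshold).astype(w.dtype)
    return w * mask, mask

pruned, mask = magnitude_prune(weights, p=0.5)
```

In the full procedure this prune-and-retrain loop repeats several times, with surviving weights reset to their initial values each round.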
The Travelling Salesman Problem was formulated in 1930, and is a classical computer science problem for optimization. It’s a simple problem:
Given a list of cities and the distances between each pair of cities, what is the shortest possible route that visits each city exactly once and returns to the origin city?
For instance, we may be given the pairwise distances between four cities. Real instances are usually posed with hundreds of “cities” (data points) precisely to rule out a brute-force search.
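To see why scale matters, here is a brute-force solver on a hypothetical four-city distance matrix. With the origin fixed, there are (n − 1)! candidate tours — trivial for n = 4, hopeless for hundreds of cities:

```python
import itertools

# Hypothetical symmetric distance matrix for four cities (0, 1, 2, 3).
dist = [
    [0, 10, 15, 20],
    [10, 0, 35, 25],
    [15, 35, 0, 30],
    [20, 25, 30, 0],
]

def tour_length(order):
    # Sum consecutive legs, closing the loop back to the starting city.
    return sum(dist[a][b] for a, b in zip(order, order[1:] + order[:1]))

# Fix city 0 as the origin and try every ordering of the remaining cities.
best_tour = min(
    (list((0,) + p) for p in itertools.permutations(range(1, 4))),
    key=tour_length,
)
```

Here only 3! = 6 tours are checked; at 50 cities the count (49!) already exceeds the number of atoms in the observable universe, which is why heuristic and approximate methods dominate in practice.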
Quantum computing is a buzz-word that’s been thrown around quite a bit. Unfortunately, despite its virality in pop culture and quasi-scientific Internet communities, its capabilities are still quite limited.
As a very new field, quantum computing presents a complete paradigm shift from the traditional model of classical computing. Classical bits — which can be 0 or 1 — are replaced in quantum computing with qubits, which instead hold a superposition of both values rather than a definite one.
Relying on the quirks of physics at a very, very small level, a qubit is forced into a state of 0 or 1 with a certain probability…
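That measurement behavior is easy to simulate classically for a single qubit. A sketch, assuming the standard amplitude formalism: a state a|0⟩ + b|1⟩ yields 0 with probability |a|² and 1 with probability |b|²:

```python
import numpy as np

rng = np.random.default_rng(7)

# Single-qubit state |psi> = a|0> + b|1>, with |a|^2 + |b|^2 = 1.
# Equal superposition: each outcome has probability 1/2.
a, b = 1 / np.sqrt(2), 1 / np.sqrt(2)
probs = [abs(a) ** 2, abs(b) ** 2]

# Measurement collapses the state: sample 0 or 1 with those probabilities.
outcomes = rng.choice([0, 1], size=10_000, p=probs)
fraction_ones = outcomes.mean()
```

The quantum advantage comes not from this sampling itself but from interference between amplitudes across many entangled qubits — something a classical simulation pays for exponentially.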
Oh no! You find out that your data is corrupted — enough training instances are attached to incorrect labels for it to be a significant problem. What should you do?
If you want to be a radical optimist, you could think of this data corruption as a form of regularization — depending on the level of corruption. However, if too many labels are corrupted, or the corruption is not balanced across classes, this view stops being practical (if it ever was, of course).
Depending on the problem, though, the…
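To experiment with this framing, it helps to inject controlled label noise yourself. A sketch of symmetric corruption — flip a chosen fraction of labels to a uniformly random *different* class (all names and rates here are illustrative):

```python
import numpy as np

rng = np.random.default_rng(3)

def corrupt_labels(y, n_classes, rate):
    """Flip a `rate` fraction of labels to a random different class."""
    y = y.copy()
    flip = rng.random(len(y)) < rate
    # Adding an offset in 1..n_classes-1 mod n_classes guarantees
    # a flipped label never lands back on its original class.
    y[flip] = (y[flip] + rng.integers(1, n_classes, flip.sum())) % n_classes
    return y

y_clean = rng.integers(0, 3, size=1_000)
y_noisy = corrupt_labels(y_clean, n_classes=3, rate=0.2)
corrupted_frac = (y_noisy != y_clean).mean()
```

Because the noise is symmetric across classes, it behaves most like regularization; the unbalanced case warned about above would instead bias the model toward particular classes.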
A company is continually seeking ways to increase profit. If it wants to expand or change its current business (in ways big or small), a common solution is experimentation.
Companies can test whether a change works out; if it does seem promising, they can incorporate it into their broader business. Especially for digital-first companies, experimentation is a driving force of innovation and growth.
A common — and relatively simple — test is the A/B test. Half of users are randomly directed towards layout A, and the other half…
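The randomize-split-compare logic can be sketched end to end with a two-proportion z-test. All numbers here are hypothetical (layout A converting at 10%, layout B at 12%, 5,000 simulated users per arm):

```python
import math
import random

random.seed(0)

# Simulated conversions under hypothetical true rates for each layout.
n = 5_000  # users per arm
conv_a = sum(random.random() < 0.10 for _ in range(n))
conv_b = sum(random.random() < 0.12 for _ in range(n))

# Two-proportion z-test: is B's conversion rate significantly higher?
p_a, p_b = conv_a / n, conv_b / n
p_pool = (conv_a + conv_b) / (2 * n)
se = math.sqrt(2 * p_pool * (1 - p_pool) / n)
z = (p_b - p_a) / se
p_value = math.erfc(z / math.sqrt(2)) / 2  # one-sided
```

The point of the random split is that any other difference between the two groups is washed out, so a small p-value can be read as evidence the layout itself caused the change.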
Machine learning has begun to pervade all aspects of life — even those protected by anti-discrimination law. It’s being used in hiring, credit, criminal justice, advertising, education, and more.
What makes this a particularly difficult problem, though, is that machine learning algorithms are fluid and intangible: complex models are hard to interpret, and therefore hard to regulate. Machine learning fairness is a rising field that seeks to cement abstract principles of “fairness” into machine learning algorithms.
We’ll look at notions of “fairness” from three perspectives. Although there’s plenty more…