Derrick Mwiti - Machine learning nuggets

Derrick Mwiti

Google Developer Expert - Machine Learning

GPT Instruction Fine-tuning With Keras

Fine-tuning has become the new training because training large language models (LLMs) from scratch is computationally expensive. It also requires collecting and preparing large datasets, which is time-intensive. These resources are within reach of only a few individuals and organizations. Fortunately, there are many open-source LLMs that one

How to Detect AI Generated Content With TensorFlow

With the plethora of open-source language models, it's incredibly difficult to determine whether a piece of text is AI-generated. However, with a good dataset, you can train a model in TensorFlow to detect whether text was generated by a large language model. It's such an interesting problem

Convolutional Neural Networks in JAX: Ultimate Guide

JAX is a high-performance library that offers accelerated computing through XLA and just-in-time (JIT) compilation. It also has handy features that enable you to write one codebase that can be applied to batches of data and run on CPU, GPU, or TPU. However, one of its biggest selling

Implementing Transformer decoder for text generation in Keras and TensorFlow

TensorFlow Featured

The recent wave of generative language models is the culmination of years of research starting with the seminal "Attention is All You Need" paper. The paper introduced the Transformer architecture that would later be used as the backbone for numerous language models. These text generation language models are

Text Classification With BERT and KerasNLP

BERT is a popular masked language model. Some words are hidden from the model, and it is trained to predict them. The model is bidirectional, meaning it has access to the words to the left and right, making it a good choice for tasks such as text classification. Training BERT can quickly

How to Build Large Language Model Applications with PaLM API and LangChain

You can now use Generative AI Studio on Vertex AI to prompt, tune and deploy Google's foundational models, including PaLM 2, Imagen, Codey, and Chirp. You can easily design and fine-tune your prompt and copy the code required to deploy the solution. Leveraging a foundational model is a

How to Perform Image Augmentation With KerasCV

Training computer vision models with little data can lead to poor model performance. This problem can be solved by generating new data samples from the existing images. For example, you can create new images by flipping and rotating the existing ones. Generating new image samples from existing ones is known
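
As a minimal sketch of the flipping and rotating idea above, the snippet below treats an image as a nested list of pixel values and derives two new samples from it. The helper names `flip_horizontal` and `rotate_90_clockwise` are hypothetical, the pixel values are made up, and this is plain Python for illustration rather than the KerasCV API.

```python
def flip_horizontal(image):
    """Mirror the image left to right by reversing every row."""
    return [row[::-1] for row in image]


def rotate_90_clockwise(image):
    """Rotate 90 degrees clockwise: reverse the row order, then transpose."""
    return [list(column) for column in zip(*image[::-1])]


# A tiny 2x3 "image" with made-up pixel values, for illustration only.
image = [
    [1, 2, 3],
    [4, 5, 6],
]

# The original plus two derived samples form the augmented set.
augmented = [image, flip_horizontal(image), rotate_90_clockwise(image)]
for sample in augmented:
    print(sample)
```

Each derived sample keeps the same content as the original, so any label attached to the original image can usually be reused for it.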

How to Build LLM Applications With LangChain and Openai

LangChain is an open-source tool for building large language model (LLM) applications. It supports a variety of open-source and closed models, making it easy to create these applications with one tool. Some of the modules in LangChain include:

* Models for supported models and integrations
* Prompts for making it easy to

How to Train Stable Diffusion With Keras

TensorFlow Featured

Image generation models are causing a sensation worldwide, particularly the powerful Stable Diffusion technique. With Stable Diffusion, you can generate images with your laptop, which was previously impossible. Here's how diffusion models work in plain English:

1. Generating images involves two processes. Diffusion adds noise gradually to the

TensorFlow Featured

How to Generate Images with Variational Autoencoders(VAE) (Create VAE from scratch using Keras and TensorFlow)

An autoencoder takes an input image and creates a low-dimensional representation, i.e., a latent vector. This vector is then used to reconstruct the original image. Regular autoencoders get an image as input and output the same image. However, Variational AutoEncoders (VAE) generate new images with the same distribution as

Distributed training with TensorFlow: How to train Keras models on multiple GPUs

Training computer vision models takes a long time because of the size of the models and the image data, especially when training on a single GPU. You can reduce the training time by distributing the training across several GPUs. This

Technical Writing

Technical Writing: Ultimate Beginners Guide

I have created technical content for various companies over the last 5 years. Educating developers is how technology companies grow their communities. Developers hate being sold to, so this is the best way to get developers to use a company's product. The product should solve a

Writing for Data Scientists

I earned $300 for my first paid data science and machine learning article. I get paid between $250 and $500 for each data science article I write. In this

TensorFlow Featured

Create U-Net from scratch (Image segmentation with U-Net with Keras and TensorFlow)

In the Implementing Fully Convolutional Networks (FCNs) from scratch in Keras and TensorFlow article, you saw how to build an image segmentation model with FCNs. However, due to the model's limitations, it did not perform very well on the segmentation task. In this post, you will see how

Implementing Fully Convolutional Networks (FCNs) from scratch in Keras and TensorFlow (Build image segmentation model from scratch)

In 2014, Jonathan Long, Evan Shelhamer, and Trevor Darrell proposed solving image segmentation problems using Fully Convolutional Neural Networks (FCNs). FCNs have no fully connected layers. Image segmentation involves making a prediction for each pixel in an image. FCNs can accept images of any size because they don't

How to become a Kaggle Competitions Grandmaster

writtencast 001 - Shujun He

In this inaugural interview of the writtencast, I am joined by Shujun He. Shujun is a Ph.D. student at Texas A&M University and a Kaggle Competitions Grandmaster. To become a Kaggle Competitions Grandmaster you need 5 gold medals and at least one

Free data science course

Join my free data science and machine learning email course with Python. Each day I will share a new lesson and provide code examples and notebooks to help you in your data science journey. The course covers beginner to advanced concepts in data science and machine learning. I will provide

Object detection with Vision Transformer for Open-World Localization(OWL-ViT)

Convolutional neural networks have been the primary networks applied in object detection. Recently, Transformers have gained popularity in natural language processing and computer vision. In this article, we explore the use of OWL-ViT in object detection. Let’s get started.

What is a Vision Transformer?

Transformers have been widely

Train ResNet in Flax from scratch(Distributed ResNet training)

Apart from designing custom CNN architectures, you can use architectures that have already been built. ResNet is one such popular architecture. In most cases, you'll achieve better performance by using such architectures. In this article, you will learn how to perform distributed training of a ResNet model in

Handling state in JAX & Flax (BatchNorm and DropOut layers)

Jitting functions in Flax makes them faster but requires that the functions have no side effects. This restriction introduces a challenge when dealing with stateful items such as model parameters and stateful layers such as batch normalization layers. In this article,
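
The no-side-effects constraint can be illustrated without JAX itself: rather than mutating a stored statistic in place, a function receives the current state and returns an updated copy. The sketch below is plain Python under that assumption; `update_running_mean` and the `momentum` value are hypothetical, and the code mirrors the idea behind batch normalization statistics rather than Flax's actual API.

```python
def update_running_mean(state, batch, momentum=0.9):
    """Return a new state dict; the input state is left untouched."""
    batch_mean = sum(batch) / len(batch)
    new_mean = momentum * state["mean"] + (1.0 - momentum) * batch_mean
    return {"mean": new_mean}


state = {"mean": 0.0}
for batch in ([1.0, 3.0], [2.0, 4.0]):
    # The caller threads the state through every call explicitly.
    state = update_running_mean(state, batch)

print(state)
```

Because the function depends only on its arguments and communicates only through its return value, calling it twice with the same inputs always gives the same result, which is the property a jitted function needs.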

Transfer learning with JAX & Flax

Training large neural networks can take days or weeks. Once these networks are trained, you can take advantage of their weights and apply them to new tasks, a technique known as transfer learning. As a result, you fine-tune a new network and get good results in a short period. Let's look at

Flax vs. TensorFlow

Flax is the neural network library for JAX. TensorFlow is a deep learning library with a large ecosystem of tools and resources. Flax and TensorFlow are similar in some ways and different in others. For instance, both Flax and TensorFlow can run on XLA. Let's look at the differences between

Activation functions in JAX and Flax

Activation functions are applied in neural networks to ensure that the network outputs the desired result. The activation functions cap the output within a specific range. For instance, when solving a binary classification problem, the outcome should be a number between 0 and 1. This indicates the probability of an
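
To make the binary classification case concrete, here is a small sketch of the sigmoid activation, which maps any real number to a value between 0 and 1. It uses only the Python standard library rather than JAX or Flax, the input values are made up, and it is not hardened against overflow for very large negative inputs.

```python
import math


def sigmoid(x):
    """Squash any real number into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


# Raw model outputs (logits); the values are made up for illustration.
logits = [-4.0, 0.0, 4.0]
probabilities = [sigmoid(z) for z in logits]
print(probabilities)
```

A logit of 0 maps to exactly 0.5, and larger logits map to values closer to 1.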

Optimizers in JAX and Flax

Optimizers are applied when training neural networks to reduce the error between the true and predicted values. This optimization is done via gradient descent. Gradient descent adjusts the network's weights to minimize a cost function. In JAX, optimizers are applied from the Optax library. Optimizers can be classified into two
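
The gradient descent idea can be sketched on a one-parameter model. The example below is plain Python with a hand-written gradient, not Optax; the toy data, the learning rate, and the function names `loss` and `grad` are assumptions chosen for illustration.

```python
def loss(w, xs, ys):
    """Mean squared error of the one-parameter model y = w * x."""
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def grad(w, xs, ys):
    """Derivative of the mean squared error with respect to w."""
    return sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)


# Toy data generated by y = 3 * x, so the best weight is 3.
xs = [1.0, 2.0, 3.0]
ys = [3.0, 6.0, 9.0]

w = 0.0
learning_rate = 0.05
for step in range(200):
    # Move the weight a small step against the gradient of the cost.
    w = w - learning_rate * grad(w, xs, ys)

print(round(w, 4), loss(w, xs, ys))
```

Each update moves `w` in the direction that lowers the cost, so the weight approaches 3 and the error approaches zero. Optimizer libraries build on this same update rule with refinements such as momentum and adaptive learning rates.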

Elegy(High-level API for deep learning in JAX & Flax)

Training deep learning networks in Flax is done in a couple of steps. It involves creating the following functions:

* Model definition.
* Compute metrics.
* Training state.
* Training step.
* Training and evaluation function.

Flax and JAX give more control in defining and training deep learning networks. However, this comes with more verbosity.