{ "cells": [ { "cell_type": "markdown", "id": "0ae4c48e", "metadata": {}, "source": [ "![Training Scientists](https://www.training-scientists.de/wp-content/uploads/2021/07/training_scientists.webp)" ] }, { "cell_type": "markdown", "id": "55e48c41", "metadata": {}, "source": [ "# Python for Beginners using ChatGPT & Claude" ] }, { "cell_type": "markdown", "id": "a94c36ff-d17f-4828-adaf-5d624db2e11d", "metadata": {}, "source": [ "## Basics\n", "By the end of this section, you will be able to:\n", "1. Understand the concept of **Integrated Development Environments (IDEs)**\n", "2. Know how **AI tools** can help you learn faster but also realize their limitations\n", "3. Differentiate between Python **scripts** and Jupyter **Notebooks**\n", "4. Write and execute a basic **\"Hello World\"** program\n", "5. Recognize the importance of **code readability and PEP 8** guidelines\n", "\n", "### Comparison of Python, MATLAB, C++, and R\n", "\n", "| **Aspect** | Python | MATLAB | C++ | R |\n", "|------------|--------|--------|-----|---|\n", "| **Learning Curve** | Easy, clean syntax **(Most beginner-friendly)** | Moderate, good for math background | Steep, requires low-level understanding | Moderate, can be challenging for beginners |\n", "| **Performance** | Slower than C++, optimizable with NumPy | Good for matrix operations | Typically fastest | Can be slow for large datasets, optimized for stats |\n", "| **Use Cases** | General-purpose, web dev, data science, AI/ML **(Most Versatile)** | Engineering, scientific computing, signal processing | System/software dev, games, resource-intensive apps | Statistical computing, data analysis, bioinformatics |\n", "| **Data Analysis & Visualization** | Strong libraries (Pandas, Matplotlib) | Excellent built-in capabilities | Limited built-in, needs external libraries | Excellent tools (e.g., ggplot2) |\n", "| **Community & Ecosystem** | Large, active, vast library ecosystem **(Largest community)** | Smaller, strong in 
academia/engineering | Large, extensive domain libraries | Strong in statistics and data science |\n", "| **Cost** | **Free**, open-source | Proprietary, licensed (expensive) | Free compilers, some paid IDEs | Free, open-source |\n", "| **Language Integration** | Easy with C/C++ and others **(Excellent interoperability)** | Can integrate with C/C++, Java, Python | Integrates with most languages | Can integrate with C/C++ and Python |\n", "| **ML & AI Support** | Excellent (TensorFlow, PyTorch, scikit-learn) **(Leader in AI/ML tools)** | Good support, less extensive than Python | Used for low-level ML and optimization | Good for statistical learning, less for deep learning |\n", "\n", "I focus mainly on the science and engineering applications of Python, but it is also widely used in industry, e.g.:\n", "\n", "- **Netflix** uses Python extensively for its recommendation engine, data analysis, and backend services.\n", "- **YouTube** uses Python for video sharing and viewing functionality.\n", "- **Instagram** uses Python (Django framework) for its backend.\n", "- **BitTorrent** used Python for the original BitTorrent client.\n", "\n", "### Integrated Development Environments\n", "\n", "I recommend using either JupyterLab Desktop (available for all operating systems) or Anaconda Cloud, which doesn't require installation.\n", "JupyterLab Desktop runs faster, but Anaconda Cloud has the Anaconda AI Assistant built in for free. If you are on a university computer where you can't install anything, Anaconda Cloud is a good choice.\n", "\n", "If you want to know about the other IDE options and why I chose JupyterLab for this and my other courses, check out these two videos:\n", "\n", "+ [13 Beginner-Friendly Python IDEs Compared in 2024: Jupyter Lab, VS Code, PyCharm, Wing, Zed and More](https://youtu.be/6lj-Mv25eWs)\n", "

\n", "\n", "

\n", "\n", "+ [Choosing the Best Beginner Friendly Python IDE in 2024: VS Code vs. JupyterLab vs. Anaconda Cloud](https://youtu.be/U41WhFaggtA)\n", "

\n", "\n", "

\n", "\n", "\n", "### Jupyter Lab\n", "For a detailed video about the installation of JupyterLab Desktop check out this video:\n", " + [Jupyter Lab Desktop: Installation, Configuration, and Best Practices for Windows & Mac](https://youtu.be/Q5li7FMUKEk)\n", "

\n", "\n", "

" ] }, { "attachments": {}, "cell_type": "markdown", "id": "99b5e10f-c532-4594-a8d8-3e96a975c96f", "metadata": {}, "source": [ "### Line Width and Limiter Lines (PEP8)\n", "\n", "In this notebook, you'll notice two limiter lines: one at 80 characters and another at 100 characters. These lines relate to an important aspect of Python coding style.\n", "\n", "**PEP 8: The Style Guide for Python Code**\n", "\n", "PEP 8 is the official style guide for Python code. It provides guidelines to improve code readability and consistency across the Python community. One key recommendation concerns line length:\n", "\n", "> 🔍 **Guideline**: Keep lines of code between 79-99 characters long.\n", "\n", "**Why Limit Line Length?**\n", "\n", "1. **Readability**: Shorter lines are easier to read and understand.\n", "2. **Side-by-Side Viewing**: Allows multiple files to be open side-by-side.\n", "3. **Printing**: Ensures code prints well on standard paper or small screens.\n", "\n", "**Example**\n", "```python\n", "# This is a very long line of code that exceeds the recommended 79-character limit and might be hard to read\n", "result = some_long_function_name(first_long_parameter_name, second_long_parameter_name, third_long_parameter_name)\n", "\n", "# Better: Split into multiple lines\n", "result = some_long_function_name(\n", " first_long_parameter_name,\n", " second_long_parameter_name,\n", " third_long_parameter_name\n", ")\n", "```\n", "**More on Pep8** and good programming principles in the **advanced courses**:\n", "+ [Python Basics](https://training-scientists.com/python-basics-course/)\n", "+ [Python for Scientists & Engineers](https://training-scientists.com/python-for-scientists-and-engineers/)\n", "+ [Python for Biologists](https://training-scientists.com/python-for-biologists/)\n", "\n", " > You can **download** this Jupyter Notebook from the video description (as PDF or as a Jupyter Notebook).\n", "\n", " > There are exercises that you can get on my course website 
https://training-scientists.com (Python Beginner Course using AI)" ] }, { "cell_type": "markdown", "id": "0ae1a03e-8dae-45e5-ac38-b9f8edcd7fd8", "metadata": {}, "source": [ "### Anaconda Cloud\n", "For Anaconda Cloud \n", "> https://anaconda.cloud\n", "\n", "no installation is necessary; you can just create an account on their website and start coding.\n", "\n", "While Jupyter Notebooks run perfectly, running Python scripts with graphical output does not work." ] }, { "cell_type": "markdown", "id": "53867bc4-c5f5-4ae1-8499-053a5ee7eb69", "metadata": {}, "source": [ "### AI Tools\n", "\n", "We will use ChatGPT, Claude and Anaconda Assistant in this course to help you learn programming faster.\n", "\n", "AI tools are great at explaining code and concepts, so they can speed up your learning considerably.\n", "\n", "> To use Claude, go to https://claude.ai, create an account, and start chatting with it.\n", "\n", "> To use ChatGPT, do the same on: https://chatgpt.com\n", "\n", "E.g. ask Claude 🤖💬:\n", "> 1. `Can you tell me how Python compares to Matlab, C++ and R?`\n", "> \n", "> 2. `Can you reformat that into a visually appealing table that I can copy paste into a Jupyter Notebook markdown cell?`\n", "\n", "Using AI tools is not cheating. Cars only look like cheating to someone who sells horse carriages, or to dinosaurs who don't want to learn anything new.\n", "\n", "The code AI tools generate does not always work, so we still need to learn programming ourselves.\n", "If you want to know more, check out:\n", "\n", "+ [Can Claude 3.5 | ChatGPT 4o | GitHub Copilot build Snake & Electron Cloud simulation in Python? #GPT](https://youtu.be/m8YKaG4-_x8)\n", "

\n", "\n", "

\n", "\n", "\n", "+ [GitHub Copilot: Accelerating Coding or False Hope? | Reaction Video](https://youtu.be/vgsNdaxnXlE)\n", "

\n", "\n", "

\n", "\n", "+ [Debunking AI Myths: My Reaction to 'Why is everyone LYING?](https://youtu.be/KVLNA2231U4)\n", "

\n", "\n", "

\n", "\n" ] }, { "cell_type": "markdown", "id": "6413de94-9c67-4826-8834-2185b5d99360", "metadata": {}, "source": [ "### Python Scripts vs Jupyter Notebooks\n", "\n", "Scripts always run completely from top to bottom, so if there is an error near the end you need to change the code and run everything again. \n", "\n", "In notebooks, by contrast, you can run code cell by cell (line by line if you want to).\n", "This makes debugging and overall development a lot faster.\n", "+ You can show multiple plots, add text like this, and structure the notebook with a table of contents\n", "\n", "+ Jupyter Notebooks allow you to structure your code with markdown cells, headings etc.\n", "\n", "+ Scripts are better, though, if you want to run games (like Snake) or simulations with video-like output\n", "\n", "+ Jupyter Notebooks have a lot of advantages but also some pitfalls, such as cell state, that we will look at later" ] }, { "cell_type": "markdown", "id": "a81c3b33-16a8-488b-a1aa-fe93127f49da", "metadata": {}, "source": [ "### Hello World (Jupyter) 👋\n", "This would not be a programming tutorial without a Hello World program" ] }, { "cell_type": "code", "execution_count": 1, "id": "fbd1ebd3-9d8f-4f2a-9ec6-c61593315156", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Hello World\n" ] } ], "source": [ "print(\"Hello World\")" ] }, { "cell_type": "markdown", "id": "59cc6c0f-15f9-41d4-acc6-4f321b058c45", "metadata": {}, "source": [ "### Hello World (Script)" ] }, { "cell_type": "markdown", "id": "70203aa5", "metadata": {}, "source": [ "## Variables & Data Types 🏷️\n", "By the end of this section, you will be able to:\n", "1. Define and use **variables** in Python\n", "2. Identify and work with different **data types** (int, float, string, boolean)\n", "3. Understand and use **f-strings** for string formatting\n", "4. Create and manipulate **lists, tuples, and dictionaries**\n", "5. 
Recognize the **appropriate use cases** for different data structures" ] }, { "cell_type": "code", "execution_count": 2, "id": "4f4f950d-8737-4774-9a75-041ec5517549", "metadata": { "ExecuteTime": { "end_time": "2022-03-08T14:59:03.154556Z", "start_time": "2022-03-08T14:59:03.140558Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Hello World\n", "16\n", "1.3\n", "(5+3j)\n", "True\n" ] } ], "source": [ "z = \"Hello World\" # string\n", "x = 16 # integer\n", "u = 1.3 # float\n", "complex_number = 5+3j # complex\n", "on_or_off = True # boolean\n", "\n", "print(z)\n", "print(x)\n", "print(u)\n", "print(complex_number)\n", "print(on_or_off)" ] }, { "cell_type": "code", "execution_count": 3, "id": "a18384a6-e21f-4f79-8956-78d2111fe940", "metadata": { "ExecuteTime": { "end_time": "2022-03-08T14:59:03.200558Z", "start_time": "2022-03-08T14:59:03.188556Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "<class 'complex'>\n" ] } ], "source": [ "# Use the type() function to get the type of a variable\n", "print(type(complex_number))" ] }, { "cell_type": "markdown", "id": "95832cfe-fe68-4daf-8dd9-113a4a7d26de", "metadata": {}, "source": [ "" ] }, { "cell_type": "markdown", "id": "d0a051c6-8061-4574-a119-d5d6e1cbe047", "metadata": {}, "source": [ "### Strings\n", "**Let's look at an example to understand what variables are and what they are useful for**" ] }, { "cell_type": "code", "execution_count": 4, "id": "15473edd-a09b-4277-a4ba-e74fa835d650", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Tim is 4 and loves to play.\n", "He builds with blocks every day.\n", "4-year-old Tim stacks them high,\n", "Tim's towers almost touch the sky.\n" ] } ], "source": [ "print(\"Tim is 4 and loves to play.\")\n", "print(\"He builds with blocks every day.\")\n", "print(\"4-year-old Tim stacks them high,\")\n", "print(\"Tim's towers almost touch the sky.\")" ] }, { "cell_type": "markdown", "id": 
"31ec6996-460c-4983-a426-1fa5ffed33d4", "metadata": {}, "source": [ "\n", "**What if we want to change the name or the age though? We would have to change it in multiple places manually**" ] }, { "cell_type": "code", "execution_count": 5, "id": "6faada13-347c-4edf-ab6a-b49f494be11e", "metadata": {}, "outputs": [], "source": [ "name = \"Max\"\n", "age = 4" ] }, { "cell_type": "code", "execution_count": 6, "id": "f6be9a84-dc09-4c78-8f5f-08cd96543c25", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Max is 4 and loves to play.\n", "He builds with blocks every day.\n", "4-year-old Max stacks them high,\n", "Max's towers almost touch the sky.\n" ] } ], "source": [ "# We can use f-strings to insert our variables into the text:\n", "print(f\"{name} is {age} and loves to play.\")\n", "print(f\"He builds with blocks every day.\")\n", "print(f\"{age}-year-old {name} stacks them high,\")\n", "print(f\"{name}'s towers almost touch the sky.\")" ] }, { "cell_type": "markdown", "id": "10a37f80-7117-4625-81dc-23c78fff8b7e", "metadata": {}, "source": [ "**We do need to execute both cells for the output to update**\n", "\n", "**f-strings are useful e.g. 
when creating file names that should contain variable data (like a temperature)**\n", "\n", "**More on f-strings in the advanced courses**" ] }, { "cell_type": "code", "execution_count": 7, "id": "39a95ce1-228d-4366-8b6a-df411d388762", "metadata": { "ExecuteTime": { "end_time": "2022-03-08T14:59:04.268251Z", "start_time": "2022-03-08T14:59:04.258250Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "I am a cheetah\n" ] } ], "source": [ "# Single- and double-quoted strings are equivalent\n", "text = 'I am a cheetah'\n", "print(text)" ] }, { "cell_type": "code", "execution_count": 8, "id": "42b472ab-d7c1-41ff-8810-c32e4b62555b", "metadata": { "ExecuteTime": { "end_time": "2022-03-08T14:59:04.268251Z", "start_time": "2022-03-08T14:59:04.258250Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Tim's towers almost touch the sky.\n" ] } ], "source": [ "# With double-quoted strings you can still use apostrophes inside the string\n", "print(\"Tim's towers almost touch the sky.\")" ] }, { "cell_type": "markdown", "id": "f29388b6-c615-4cf8-b783-052f653b8fd1", "metadata": {}, "source": [ "### Integer variables & Basic Math" ] }, { "cell_type": "code", "execution_count": 9, "id": "fb9675c4", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "5\n" ] } ], "source": [ "a = 5\n", "print(a)" ] }, { "cell_type": "code", "execution_count": 10, "id": "31acbac7", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "3" ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Instead of calling print(), we can end the cell with the variable name to display its value\n", "# This only works for the last expression in a cell though\n", "b = 3\n", "b" ] }, { "cell_type": "code", "execution_count": 11, "id": "fecce215", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "8" ] }, "execution_count": 11, "metadata": {}, "output_type": "execute_result" } ], 
"source": [ "c = a + b\n", "c" ] }, { "cell_type": "markdown", "id": "04282e4e", "metadata": {}, "source": [ "### Cell state (Jupyter Notebook pitfall)" ] }, { "cell_type": "code", "execution_count": 12, "id": "567e3973-7d3b-4a5f-81a4-51a65c27195f", "metadata": {}, "outputs": [], "source": [ "x = 5\n", "y = 2" ] }, { "cell_type": "code", "execution_count": 13, "id": "ed4f810c-b9f3-4e60-abdc-c5516ea3770b", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "7" ] }, "execution_count": 13, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# If the cells are executed top to bottom everything is working\n", "# If you execute this cell without the previous one Python will not know what x and y are\n", "z = x + y\n", "z" ] }, { "cell_type": "markdown", "id": "d7762ce6-1efe-4939-a664-c424a4fa5ee0", "metadata": {}, "source": [ "One common mistake I see beginners do is name all their variables x and y which leads to different results depending on the order in which cells are executed." ] }, { "cell_type": "code", "execution_count": 14, "id": "52c380c5-ac58-4eba-b916-69500c48f289", "metadata": {}, "outputs": [], "source": [ "x = 10\n", "y = 15" ] }, { "cell_type": "markdown", "id": "77f06c0e-f525-4a14-b1fa-fcbe6efa0a17", "metadata": {}, "source": [ "This is still an issue even if you delete the cell because the variables you define are kept in memory.\n", "\n", "You need to restart the kernel to clear the memory and rerun the cells to add the variables that you do want to memory again." ] }, { "cell_type": "markdown", "id": "47217094-610c-4d2c-addd-a0d54345f19e", "metadata": {}, "source": [ "### Comments\n", "\n", "You see me use comments throughout the Notebook to add context, clarify things and for explanations.\n", "\n", "Comments are text that Python ignores when running code. 
They make your code more readable and understandable.\n", "\n", "Start with `#`\n", "```python\n", " # This is a single-line comment\n", " x = 5 # Comment at the end of a line\n", "```\n", "\n", "Best Practices for Using Comments\n", "\n", "+ **Be Clear and Concise**: Write comments that are easy to understand and to the point.\n", "+ **Update Comments**: Always update comments when you change your code to avoid misleading information.\n", "+ **Avoid Obvious Comments**: Don't state the obvious. Focus on explaining 'why' rather than 'what'.\n", "\n", "```python\n", " # Bad: Increment x by 1\n", " x += 1\n", "\n", " # Good: Increment age after birthday\n", " age += 1\n", "```\n", "+ Use Comments for Complex Logic: **Explain** tricky, non-obvious, or important parts of your code.\n", "\n", "+ **Code Sectioning**: Use comments to divide your code into logical sections. (in Jupyter you can use markdown cells and headings for that)\n", "```python\n", "# Data Preprocessing\n", "...\n", "\n", "# Model Training\n", "...\n", "\n", "# Results Analysis\n", "...\n", "```\n", "+ **TODO Comments**: Mark areas that need future work.\n", "\n", "```python\n", "# TODO: Implement error handling for invalid inputs\n", "```\n", "\n", "Remember: While comments are important, **clear and self-explanatory code is even better**. \n", "Use **descriptive variable and function names** to reduce the need for excessive commenting." 
] }, { "cell_type": "markdown", "id": "b1994a76-cfde-44c6-8682-8cc7e1b1c72b", "metadata": {}, "source": [ "### Tuples 🔗\n", "\n", "Tuples are immutable: they cannot be changed after creation, which makes them a good fit for values that should stay constant" ] }, { "cell_type": "code", "execution_count": 15, "id": "84945fd0-d841-4f2a-b3be-b813dc03ab61", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(5, 2)" ] }, "execution_count": 15, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Create by using round brackets ()\n", "coordinates = (5, 2)\n", "coordinates" ] }, { "cell_type": "code", "execution_count": 16, "id": "f1c0f23c-981a-41be-9b2d-8eb0c2e3812c", "metadata": {}, "outputs": [], "source": [ "# This will not work (it raises a TypeError because tuples are immutable):\n", "# coordinates[1] = 5" ] }, { "cell_type": "markdown", "id": "3aa15002-ca5c-48bc-a85a-222ce1f6d334", "metadata": {}, "source": [ "If a function (section 5) returns more than one value, the values are returned as a tuple" ] }, { "cell_type": "markdown", "id": "4a558903", "metadata": {}, "source": [ "### Lists 📝" ] }, { "cell_type": "code", "execution_count": 17, "id": "7018e37f", "metadata": {}, "outputs": [], "source": [ "temp1 = 5\n", "temp2 = 7\n", "temp3 = 10" ] }, { "cell_type": "code", "execution_count": 18, "id": "992c7a95", "metadata": {}, "outputs": [], "source": [ "# Create a list by using square brackets []\n", "temp = [5, 7, 10]" ] }, { "cell_type": "code", "execution_count": 19, "id": "98d99848", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[5, 7, 10]" ] }, "execution_count": 19, "metadata": {}, "output_type": "execute_result" } ], "source": [ "temp" ] }, { "cell_type": "code", "execution_count": 20, "id": "ab0f544a-412c-43a9-9c72-a8d2110583a5", "metadata": { "ExecuteTime": { "end_time": "2022-03-08T14:59:03.339557Z", "start_time": "2022-03-08T14:59:03.327556Z" } }, "outputs": [ { "data": { "text/plain": [ "[5, 7, 10, 15]" ] }, "execution_count": 20, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Appending\n", "\n", "temp.append(15) # add 
15 to list at the end\n", "temp" ] }, { "cell_type": "code", "execution_count": 21, "id": "77065424-1cb0-49d9-972c-b8efa7d651e2", "metadata": { "ExecuteTime": { "end_time": "2022-03-08T14:59:03.278556Z", "start_time": "2022-03-08T14:59:03.271553Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[1, 'x', 4, 6, 8]\n" ] } ], "source": [ "# Mixing of datatypes is possible:\n", "\n", "testlist_1 = [1,'x',4,6,8] \n", "\n", "print(testlist_1)" ] }, { "cell_type": "code", "execution_count": 22, "id": "826c21a3-7185-4984-8148-bcccba138382", "metadata": { "ExecuteTime": { "end_time": "2022-03-08T14:59:03.308556Z", "start_time": "2022-03-08T14:59:03.288558Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "x\n", "[1, 'x']\n", "[4, 6, 8]\n" ] } ], "source": [ "# Indexing, Slicing\n", "\n", "print(testlist_1[1]) # Use indexing, in python indices start at 0\n", "\n", "print(testlist_1[0:2]) # Use slicing, :2 means 'until but not including 2'\n", "\n", "print(testlist_1[2:]) # 2: means from index 2 until the end" ] }, { "cell_type": "markdown", "id": "116585e2-df51-407d-9350-84be44ec8bf3", "metadata": {}, "source": [ "" ] }, { "cell_type": "code", "execution_count": 23, "id": "b8902e46-ace6-499c-b696-4af671017dae", "metadata": { "ExecuteTime": { "end_time": "2022-03-08T14:59:03.323553Z", "start_time": "2022-03-08T14:59:03.312556Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[1, 'x', 4, 6, 8, 'boat', 42, 39.9, 'x']\n" ] } ], "source": [ "# Concatenating\n", "\n", "testlist_2 = ['boat', 42, 39.9, 'x']\n", "\n", "merged_list = testlist_1 + testlist_2\n", "print(merged_list)" ] }, { "cell_type": "code", "execution_count": 24, "id": "088d8ea7-2fb1-43dc-82e3-f6df7aad78b1", "metadata": { "ExecuteTime": { "end_time": "2022-03-08T14:59:03.354554Z", "start_time": "2022-03-08T14:59:03.344556Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[1, 4, 6, 8, 'boat', 42, 39.9, 'x']\n" ] } ], 
"source": [ "# Remove by value\n", "merged_list.remove('x') # remove 'x'\n", "print(merged_list)" ] }, { "cell_type": "code", "execution_count": 25, "id": "df174241-119a-4399-995c-9e0a20e4a987", "metadata": { "ExecuteTime": { "end_time": "2022-03-08T14:59:03.369557Z", "start_time": "2022-03-08T14:59:03.358558Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[1, 6, 8, 'boat', 42, 39.9, 'x']\n" ] } ], "source": [ "# Remove by index\n", "merged_list.pop(1) # remove item with index 1\n", "print(merged_list)" ] }, { "cell_type": "code", "execution_count": 26, "id": "75f6b965-4d29-42fb-82bd-46af3f8c7dba", "metadata": { "ExecuteTime": { "end_time": "2022-03-08T14:59:03.384557Z", "start_time": "2022-03-08T14:59:03.374555Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "['boat', 'car', 'cow', 'house', 'pig']\n" ] } ], "source": [ "# Sorting\n", "list_of_strings = ['car','house','boat','cow', 'pig']\n", "list_of_strings.sort() # Can use sort to sort alphabetically or numerically\n", "print(list_of_strings)" ] }, { "cell_type": "markdown", "id": "e413631a-bba4-4da7-88fc-257587888568", "metadata": {}, "source": [ "We will mostly be using numpy arrays and pandas dataframes in the advanced courses\n", "\n", "so if you want to know more about lists, \n", "> ask Claude 🤖💬: `Tell me about python lists and everything I can do with them`" ] }, { "cell_type": "markdown", "id": "e97983fc-2eac-4459-a49f-0c3e991f78b4", "metadata": {}, "source": [ "### Dictionaries 📖\n", "\n", "A dictionary in Python is a collection of key-value pairs. Each key is unique, and it is associated with a value. 
You can think of a dictionary as a real-world dictionary where you look up a word (the key) and find its definition (the value).\n", "\n", "Dictionaries are created using curly braces `{}`. Each key is separated from its value by a colon `:`, and the key-value pairs are separated by commas." ] }, { "cell_type": "code", "execution_count": 27, "id": "4a625d35-7728-487e-8440-c7c63109ab89", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Alice\n", "30\n", "+49 178 12345\n" ] } ], "source": [ "# Creating a dictionary to store a person's details\n", "person = {\n", "    \"name\": \"Alice\",\n", "    \"age\": 30,\n", "    \"phone\": \"+49 178 12345\"\n", "}\n", "\n", "# Accessing values using keys\n", "print(person[\"name\"])\n", "print(person[\"age\"])\n", "\n", "# Alternatively, use the .get() method, which returns None if the key doesn't exist\n", "print(person.get(\"phone\"))" ] }, { "cell_type": "markdown", "id": "18d44ec3-c453-40e9-9482-0fa4f940e6d6", "metadata": {}, "source": [ "" ] }, { "cell_type": "markdown", "id": "92837146-b060-4a73-8bdd-18f80120b8b9", "metadata": {}, "source": [ "**Dictionaries are useful, for example, when we are plotting (more later)**" ] }, { "cell_type": "markdown", "id": "d54d6519-619b-4eaf-862a-73b38d53d75e", "metadata": {}, "source": [ "> Ask Claude 🤖💬: `What are the main differences between lists, tuples, and dictionaries in Python?`" ] }, { "cell_type": "markdown", "id": "6f174bc6-7263-46ec-bc20-6b1f17b16952", "metadata": { "jp-MarkdownHeadingCollapsed": true }, "source": [ "---\n", "### Why Use Dictionaries?\n", "\n", "Dictionaries are particularly useful when:\n", "\n", "- **You need to look up values by a unique key.** \n", " For example, if you have an employee ID and you want to quickly find the corresponding employee name, a dictionary is ideal.\n", " \n", "- **You want to store data with named properties.** \n", " This is common in cases where you have related information (like user data) and need to access parts of it frequently.\n", "\n", "---\n", 
"\n", "### Comparison of Data Structures\n", "\n", "*Here's how dictionaries differ from other data structures:*\n", "\n", "- **Lists:**\n", " - Lists are ordered collections of items that are accessed by their index (a position number starting from 0).\n", " - Useful when the order of elements is important or you want to store multiple items of the same type.\n", " - *Example*: `my_list = [1, 2, 3, 4]`\n", " - **Dictionaries are better** when you need named access to items rather than indexed access.\n", "\n", "- **Tuples:**\n", " - Tuples are similar to lists, but they are immutable (cannot be changed after creation).\n", " - Useful for fixed collections of items.\n", " - *Example*: `my_tuple = (1, 2, 3)`\n", " - **Dictionaries provide more flexibility** as they allow for dynamic modifications (adding/removing key-value pairs).\n", "\n", "- **NumPy Arrays:** (later)\n", " - NumPy arrays are specialized for numerical data and mathematical operations. They offer fast processing for large amounts of numerical data.\n", " - *Example*: `np.array([1, 2, 3, 4])`\n", " - **Dictionaries are better** for mixed data types (like strings and numbers) and quick lookups by key.\n", "\n", "- **Pandas DataFrames:** (in advanced course)\n", " - DataFrames are 2-dimensional tabular data structures in the pandas library. 
They are great for handling and analyzing structured data.\n", " - *Example*: `pd.DataFrame({\"Name\": [\"Alice\", \"Bob\"], \"Age\": [30, 25]})`\n", " - **Dictionaries are simpler and more lightweight** for cases where you just need a quick lookup table or small, unstructured data.\n", "\n", "---\n", "\n", "> **Note**: **Lists []** use square brackets, **Tuples ()** use round brackets, **dictionaries {}** use curly brackets.\n", "> \n", "> For numpy arrays it depends whether we convert a list to an array or if we create an array from scratch (more later)" ] }, { "cell_type": "markdown", "id": "8551535d-92b4-4f81-a0f8-18ca00f75acc", "metadata": {}, "source": [ "## If statements 🔀\n", "By the end of this section, you will be able to:\n", "1. Write basic conditional statements using **if, elif, and else**\n", "2. Use **comparison operators** (==, !=, <, >, <=, >=) in conditional statements\n", "3. Combine conditions using logical operators **(and, or, not)**\n", "4. Understand and apply **boolean logic** in programming contexts" ] }, { "cell_type": "markdown", "id": "9e448f18-40c9-43e7-ad03-1407f4f63c5b", "metadata": {}, "source": [ "" ] }, { "cell_type": "markdown", "id": "200eea84-5495-4c0e-be9a-ce2234069232", "metadata": {}, "source": [ "### Indentation" ] }, { "cell_type": "code", "execution_count": 28, "id": "dc419a1d-faef-404a-a3eb-251b76bc3e8e", "metadata": { "ExecuteTime": { "end_time": "2022-03-14T03:33:57.587415Z", "start_time": "2022-03-14T03:33:57.571747Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "i is 0\n" ] } ], "source": [ "# Indentation indicates blocks of code\n", "\n", "i = 0\n", "if (i==0):\n", " print('i is 0')\n", "else:\n", " print('i is not 0')" ] }, { "cell_type": "code", "execution_count": 29, "id": "b6859e92-22a2-45ce-ba24-abf636f3b255", "metadata": { "ExecuteTime": { "end_time": "2022-03-08T14:59:03.076553Z", "start_time": "2022-03-08T14:59:03.062553Z" } }, "outputs": [], "source": [ "# Arbitrary use of 
indentation creates error\n", "\n", "f=1\n", "# This will not work:\n", " #g=1" ] }, { "cell_type": "markdown", "id": "cb175444-a7e9-40b5-8045-912c852e1fd4", "metadata": {}, "source": [ "If statements control the program flow: \n", "\n", "If you only want parts of the code executed in case a certain condition is met then use if statements." ] }, { "cell_type": "code", "execution_count": 30, "id": "5c738542-8cd8-4cb6-935e-fde5df3cec4f", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Enjoy the sunny day!\n" ] } ], "source": [ "# The if statement checks for a boolean variable (True/False)\n", "# We can define it beforehand:\n", "\n", "is_sunny = True\n", "\n", "if is_sunny:\n", " print(\"Enjoy the sunny day!\")\n", "else:\n", " print(\"Don't forget your umbrella!\")" ] }, { "cell_type": "markdown", "id": "789c1148-8967-4838-922e-d39ddd3d055d", "metadata": {}, "source": [ "### \"and\" operator" ] }, { "cell_type": "code", "execution_count": 31, "id": "f6b1b2b5-724d-41d7-b7fc-43474514a807", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "It is either not sunny or not warm or neither\n" ] } ], "source": [ "# \"and\" to check for multiple conditions\n", "is_sunny = False\n", "is_warm = False\n", "\n", "if is_sunny and is_warm:\n", " print(\"Enjoy the warm, sunny day. Take sunglasses.\")\n", "else:\n", " print(\"It is either not sunny or not warm or neither\")" ] }, { "cell_type": "markdown", "id": "94ff5689-dead-43d9-8b44-dee6c1a76e95", "metadata": {}, "source": [ "### \"or\" operator" ] }, { "cell_type": "code", "execution_count": 32, "id": "9a5c0ea7-054c-4483-9207-44dd1b82b3b3", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "It is neither sunny nor warm. 
Just stay home\n" ] } ], "source": [ "# \"or\" to check for multiple conditions:\n", "is_sunny = False\n", "is_warm = False\n", "\n", "if is_sunny or is_warm:\n", " print(\"It is either sunny or warm or both\")\n", "else:\n", " print(\"It is neither sunny nor warm. Just stay home\")" ] }, { "cell_type": "markdown", "id": "6aaaa3db-893d-4385-a309-26e1066df5a5", "metadata": {}, "source": [ "### Boolean Logic Weather example\n", "\n", "Let's explore how boolean variables work using two weather conditions: sunniness and warmth.\n", "\n", "**Key:**\n", "- X means True (Yes)\n", "- O means False (No)\n", "\n", "**All Possible Combinations:**\n", "\n", "| Weather Condition | Case 1 | Case 2 | Case 3 | Case 4 |\n", "|-------------------|--------|--------|--------|--------|\n", "| Is it sunny? | X | X | O | O |\n", "| Is it warm? | X | O | X | O |\n", "\n", "**Logical Operations:**\n", "\n", "| Operation | Case 1 | Case 2 | Case 3 | Case 4 | Explanation |\n", "|----------------------------|--------|--------|--------|--------|------------------------------------|\n", "| Is it sunny AND warm? | X | O | O | O | True only when both are true |\n", "| Is it sunny OR warm? | X | X | X | O | True if at least one is true |" ] }, { "cell_type": "markdown", "id": "de967991-ef91-4239-8033-3d55da56b725", "metadata": {}, "source": [ "" ] }, { "cell_type": "markdown", "id": "e7eb1223-04ce-4442-84c8-d0829e663c7c", "metadata": {}, "source": [ "### elif" ] }, { "cell_type": "code", "execution_count": 33, "id": "bbcf7c0b-c955-4243-a628-f305fc38bac2", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "It is not sunny but it is warm. Take an umbrella\n" ] } ], "source": [ "# Let's catch all four cases:\n", "is_sunny = False\n", "is_warm = True\n", "\n", "if is_sunny and is_warm: # case 1\n", " print(\"Enjoy the warm, sunny day. Take sunglasses.\")\n", "elif is_sunny and not(is_warm): # case 2\n", " print(\"It is sunny but it isn't warm. 
Take a jacket\")\n", "elif is_warm: # case 3\n", " print(\"It is not sunny but it is warm. Take an umbrella\")\n", "else: # case 4\n", " print(\"It is neither sunny nor warm. Just stay home\")\n", "\n", "# For case 3 there is no need to check again whether it is sunny because\n", "# we already checked the two cases where it is sunny" ] }, { "cell_type": "markdown", "id": "a8c90457-a0e7-497e-a075-b807744fe43e", "metadata": {}, "source": [ "> Ask Claude 🤖💬: `Explain this code cell to me. How do these if statements work? Why did we not have to check again for is_sunny in case 3?`" ] }, { "cell_type": "markdown", "id": "12da5b0f-3f79-4c0a-8433-1e508631d9cd", "metadata": {}, "source": [ "### Creating booleans by comparison\n", "Comparing two numbers with `>`, `<`, `==`, `!=`, `<=` or `>=` creates a boolean." ] }, { "cell_type": "code", "execution_count": 34, "id": "dfb800f3-3fcd-41ca-827d-526948228375", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "True\n", "False\n", "False\n" ] } ], "source": [ "print(1 > 0)\n", "print(1 < 0)\n", "print(1 == 0)" ] }, { "cell_type": "code", "execution_count": 35, "id": "bf579e61-c371-4a9b-ab9e-25c6e923cd40", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "i is greater than 0\n" ] } ], "source": [ "# You can create booleans by comparison:\n", "\n", "i = 5\n", "if i == 0:\n", " print('i is 0')\n", "elif i < 0:\n", " print('i is smaller than 0')\n", "else:\n", " print('i is greater than 0')" ] }, { "cell_type": "markdown", "id": "72bec9a2-dee8-4d65-8c4e-9aeecb77944b", "metadata": {}, "source": [ "---\n", "As a beginner, it's enough to understand the following concepts:\n", "\n", "1. **Boolean Variables**: What they are and how they work\n", "2. **Conditional Statements**: The syntax and logic behind `if`... `elif`... `else`\n", "3. **Logical Operators**: Combining multiple conditions with:\n", " - `and`\n", " - `or`\n", " - `not(..)`\n", "4. 
**Comparison Operators**: Used to compare values\n", " - `==` (equal to)\n", " - `!=` (not equal to)\n", " - `<` (less than)\n", " - `>` (greater than)\n", " - `<=` (less than or equal to)\n", " - `>=` (greater than or equal to)\n", "\n", "---\n", "\n", "🚀 This foundational knowledge is all you need for the advanced courses:\n", "\n", "- [Python Basics](https://training-scientists.com/python-basics-course/)\n", "- [Python for Scientists & Engineers](https://training-scientists.com/python-for-scientists-and-engineers/)\n", "- [Python for Biologists](https://training-scientists.com/python-for-biologists/)\n", "\n", "\n", "---" ] }, { "cell_type": "markdown", "id": "d73d3712-7a61-4983-88a8-f6726acd989a", "metadata": {}, "source": [ "## Functions 🧮\n", "By the end of this section, you will be able to:\n", "1. Define and explain the **purpose of functions** in Python\n", "2. **Create functions** using the `def` keyword\n", "3. Understand the concept of function **parameters and return values**\n", "4. Call functions and use their **return values**\n", "5. Explain the difference between **local and global scope** in functions\n", "6. 
Recognize and apply **best practices** in function naming and design" ] }, { "cell_type": "code", "execution_count": 36, "id": "fd2cabaf-62a5-414c-98fa-e244ec29ec90", "metadata": {}, "outputs": [], "source": [ "# To define a function we use the \"def\" keyword, the name of the function,\n", "# the input arguments in parentheses, and a colon at the end.\n", "# All code inside the function needs to be indented.\n", "\n", "def hello_world():\n", " print(\"Hello World\")" ] }, { "cell_type": "code", "execution_count": 37, "id": "8caa8176-374b-4edf-a6fc-0fe9e2c03a90", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Hello World\n" ] } ], "source": [ "# We need to call the function for something to happen\n", "hello_world()" ] }, { "cell_type": "markdown", "id": "d000de7b-d931-41ad-a0e4-8903ea581ab9", "metadata": {}, "source": [ "" ] }, { "cell_type": "code", "execution_count": 38, "id": "6d337103-41c6-42d4-aad3-d582b97a7d12", "metadata": { "ExecuteTime": { "end_time": "2022-03-08T14:59:03.462552Z", "start_time": "2022-03-08T14:59:03.448558Z" } }, "outputs": [], "source": [ "def square_function(x):\n", " return x**2" ] }, { "cell_type": "code", "execution_count": 39, "id": "ac9fab0f-cadc-4083-8b78-ad620afd94f1", "metadata": { "ExecuteTime": { "end_time": "2022-03-08T14:59:03.477556Z", "start_time": "2022-03-08T14:59:03.464552Z" } }, "outputs": [ { "data": { "text/plain": [ "9" ] }, "execution_count": 39, "metadata": {}, "output_type": "execute_result" } ], "source": [ "square_function(3)" ] }, { "cell_type": "code", "execution_count": 40, "id": "7d9db7dc-4646-4f64-ada8-bcfbcf612270", "metadata": {}, "outputs": [], "source": [ "# We can save the return value of the function in another variable:\n", "\n", "result = square_function(3)" ] }, { "cell_type": "code", "execution_count": 41, "id": "ec65b963-7272-47b1-b625-0c4544828690", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "9" ] }, "execution_count": 
41, "metadata": {}, "output_type": "execute_result" } ], "source": [ "result" ] }, { "cell_type": "code", "execution_count": 42, "id": "62dad2fa-bf69-4445-8731-15f11fc02bee", "metadata": {}, "outputs": [], "source": [ "# A function can have multiple input values:\n", "def multiple_input(x, y):\n", " return x**2 + y**2" ] }, { "cell_type": "code", "execution_count": 43, "id": "8f7b4af6-71ae-441c-b7c9-563b7a0b8005", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "13" ] }, "execution_count": 43, "metadata": {}, "output_type": "execute_result" } ], "source": [ "z = multiple_input(2, 3)\n", "z" ] }, { "cell_type": "markdown", "id": "775283c2-8d95-4ce4-955a-9817eda94004", "metadata": {}, "source": [ "> Ask ChatGPT 🤖💬: `Can you explain the concept of return values in functions in Python and why they're important?`" ] }, { "cell_type": "markdown", "id": "0302f39f-1405-47d2-bc98-099f1df1222f", "metadata": {}, "source": [ "### Why Use Functions?\n", "\n", "Functions are fundamental building blocks in programming that offer several advantages:\n", "\n", "1. **Organize Code**: Functions help structure your program into logical, manageable chunks.\n", "\n", "2. **Avoid Repetition**: Instead of copy-pasting code, functions allow you to reuse code efficiently.\n", "\n", "3. **Enhance Readability**: Well-named functions make your code self-documenting and easier to understand.\n", "\n", "4. 
**Improve Maintainability**: When code is organized into functions, it's easier to update and debug.\n", "\n", "> 💡 **Pro Tip**: Whenever you find yourself copy/pasting parts of code, there's usually a better way – and that way often involves functions!\n", "\n", "Here's an **example** of how functions can **simplify our code**:" ] }, { "cell_type": "code", "execution_count": 44, "id": "7533d4d6", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Result for 30: 935\n" ] } ], "source": [ "temp1 = 30\n", "result_1 = temp1**2 + temp1 + 5\n", "print(f\"Result for {temp1}: {result_1}\")" ] }, { "cell_type": "code", "execution_count": 45, "id": "06c86791-4b4b-4739-a0d9-53346d67d171", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Result for 40: 1645\n" ] } ], "source": [ "temp2 = 40\n", "result_2 = temp2**2 + temp2 + 5\n", "print(f\"Result for {temp2}: {result_2}\")" ] }, { "cell_type": "code", "execution_count": 46, "id": "97fc3a8c", "metadata": {}, "outputs": [], "source": [ "def polynomial_function(x):\n", " return (x**2 + x + 5)" ] }, { "cell_type": "code", "execution_count": 47, "id": "57c2a262", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "935" ] }, "execution_count": 47, "metadata": {}, "output_type": "execute_result" } ], "source": [ "polynomial_function(temp1)" ] }, { "cell_type": "code", "execution_count": 48, "id": "3b0b931a", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "1645" ] }, "execution_count": 48, "metadata": {}, "output_type": "execute_result" } ], "source": [ "polynomial_function(temp2)" ] }, { "cell_type": "markdown", "id": "064e5ac6-6305-4b20-8be3-c084db628888", "metadata": {}, "source": [ "### Multiple return values" ] }, { "cell_type": "code", "execution_count": 49, "id": "ab5b40ed-930f-484d-bb6e-18c4949df11b", "metadata": {}, "outputs": [], "source": [ "def min_max_average(numbers):\n", " minimum = min(numbers)\n", " maximum = max(numbers)\n", " 
average = sum(numbers) / len(numbers)\n", " return minimum, maximum, average" ] }, { "cell_type": "code", "execution_count": 50, "id": "3d12a48d-c3ff-489e-b77b-4539b5d0993c", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(1, 9, 5.142857142857143)" ] }, "execution_count": 50, "metadata": {}, "output_type": "execute_result" } ], "source": [ "numbers = [4, 2, 9, 7, 5, 1, 8]\n", "min_max_average(numbers)" ] }, { "cell_type": "markdown", "id": "beea210a-e9e4-490b-9f75-2514c561b30b", "metadata": {}, "source": [ "**Note how this returns a tuple: (min, max, average)**" ] }, { "cell_type": "code", "execution_count": 51, "id": "e205f866-e8c7-4b59-bf2f-71ca96b61db0", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Minimum: 1\n", "Maximum: 9\n", "Average: 5.14\n" ] } ], "source": [ "# We can \"unpack\" the tuple like this:\n", "min_val, max_val, avg_val = min_max_average(numbers)\n", "\n", "print(f\"Minimum: {min_val}\")\n", "print(f\"Maximum: {max_val}\")\n", "print(f\"Average: {avg_val:.2f}\")" ] }, { "cell_type": "markdown", "id": "fb7bbc67-1a78-49f8-a840-1f99e39cee2c", "metadata": {}, "source": [ "> 💡 **Note**: For **math operations** like this we will be using **numpy (section 8)** and numpy arrays. Numpy has a lot of built-in functions that have **multiple return values**." ] }, { "cell_type": "markdown", "id": "b925ebae-fce4-4b8c-ad56-0bff40df3193", "metadata": {}, "source": [ "### Understanding Scope in Python 🔍\n", "\n", "In Python, the scope of a variable determines where it can be accessed in your code. 
Let's explore two main types of scope:\n", "\n", "### Global Scope\n", "\n", "- Variables defined outside of functions\n", "- Accessible throughout the entire code\n", "- Can be read from anywhere, but modifying them requires special handling\n", "\n", "### Local Scope\n", "\n", "- Variables defined inside functions\n", "- Only accessible within that specific function\n", "- Helps prevent naming conflicts and unintended modifications\n", "\n", "> 💡 **Best Practice**: Keep the scope of variables as small as possible. This helps prevent errors and makes your code more maintainable." ] }, { "cell_type": "markdown", "id": "397b0541-6cf2-4394-9a6b-edbc6ad5dc11", "metadata": {}, "source": [ "" ] }, { "cell_type": "markdown", "id": "e8809f07-d12c-4ab4-9951-a0c4db3b4a33", "metadata": {}, "source": [ "### Why Scope Matters\n", "\n", "1. **Prevents Naming Conflicts**: Local variables can have the same name as global variables without interfering with each other.\n", "2. **Improves Code Organization**: Clearly defined scopes make it easier to understand where variables are used and modified.\n", "3. **Enhances Debugging**: Limiting scope makes it easier to track down issues in your code." ] }, { "cell_type": "code", "execution_count": 52, "id": "6039f9f5-5d6a-48e1-acb6-b7f5d0a633e2", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "I'm global - inside the function\n", "I'm local - inside the function\n", "I'm global\n" ] } ], "source": [ "# Let's look at an example of global vs. 
local variables\n", "global_var = \"I'm global\"\n", "\n", "def scope_example():\n", " local_var = \"I'm local\"\n", " print(global_var + \" - inside the function\") # Can access global variables\n", " print(local_var + \" - inside the function\") # Can access local variables\n", "\n", "scope_example()\n", "print(global_var) # This works\n", "#print(local_var) # This would raise an error because local_var is not accessible here" ] }, { "cell_type": "markdown", "id": "2652a3eb-6b1b-46d6-b2da-2aecb4033a09", "metadata": {}, "source": [ "### Good Programming Practice ⭐\n", "\n", "While global variables are sometimes necessary, it's generally considered good programming practice to avoid accessing them directly within functions. Instead:\n", "\n", "- Pass required data as input parameters to your functions\n", "- Return modified values from functions rather than changing global state\n", "\n", "This approach, known as \"passing parameters,\" offers several benefits:\n", "1. **Improved Readability**: It's clear what data the function needs to operate.\n", "2. **Better Testability**: Functions that don't rely on global state are easier to test.\n", "3. **Reduced Side Effects**: Functions don't unexpectedly modify global variables.\n", "4. 
**Enhanced Reusability**: Functions can be used in different contexts without relying on specific global variables.\n", "\n", "Example of good practice:" ] }, { "cell_type": "code", "execution_count": 53, "id": "613412f3-37f0-4f92-9737-290e45e86d9f", "metadata": {}, "outputs": [], "source": [ "# Instead of this:\n", "global_data = 10\n", "\n", "def process_data():\n", " global global_data\n", " return global_data * 2\n", "\n", "# Prefer this:\n", "def process_data(input_data):\n", " return input_data * 2\n", "\n", "result = process_data(10)" ] }, { "cell_type": "markdown", "id": "99e6a0da", "metadata": {}, "source": [ "### Error messages ⚠️" ] }, { "cell_type": "markdown", "id": "bc4fd295-45f0-40ae-b32f-ea82e1db1d84", "metadata": {}, "source": [ "```python\n", "def polynomial_function(x):\n", " return (x**2 + x + 5)\n", "```" ] }, { "cell_type": "code", "execution_count": 54, "id": "ac462a1e", "metadata": {}, "outputs": [], "source": [ "test_list = [1, 2, 3]\n", "# Uncomment the next line to see the error message (a TypeError):\n", "#polynomial_function(test_list)" ] }, { "cell_type": "markdown", "id": "16f3b110-9e6c-4386-8b6f-00863e9fab32", "metadata": {}, "source": [ "We will see later how to apply mathematical functions to numpy arrays (that is, to multiple values at once). This does not work with lists.\n", "> Ask Anaconda Cloud 🤖💬: `Why are we getting an error message here?`" ] }, { "cell_type": "markdown", "id": "0e7a995e-b295-44d7-b0b9-ac73d5dcfedf", "metadata": {}, "source": [ "### Common Errors (How to Handle Them)\n", "\n", "When learning Python, encountering errors is part of the process. Here are some common errors you might face and how to address them:\n", "\n", "1. **SyntaxError**\n", " - Cause: Incorrect Python syntax\n", " - Example: `print \"Hello World\"` (missing parentheses in Python 3)\n", " - Fix: Correct the syntax: `print(\"Hello World\")`\n", "\n", "2. 
**IndentationError**\n", " - Cause: Incorrect indentation of code blocks\n", " - Example:\n", " ```python\n", " if True:\n", " print(\"This is incorrectly indented\")\n", " ```\n", " - Fix: Properly indent the code:\n", " ```python\n", " if True:\n", " print(\"This is correctly indented\")\n", " ```\n", "\n", "3. **NameError**\n", " - Cause: Using a variable or function name that hasn't been defined\n", " - Example: `print(undefined_variable)`\n", " - Fix: Ensure the variable is defined before use or check for typos in the name\n", "\n", "4. **TypeError**\n", " - Cause: Performing an operation on an inappropriate data type\n", " - Example: `\"2\" + 2` (trying to add a string and an integer)\n", " - Fix: Convert types appropriately: `int(\"2\") + 2` or `\"2\" + str(2)`\n", "\n", "5. **IndexError**\n", " - Cause: Trying to access a list index that doesn't exist\n", " - Example: `my_list = [1, 2, 3]` then `print(my_list[3])`\n", " - Fix: Ensure your index is within the valid range: `print(my_list[2])` (remember, indexing starts at 0)\n", "\n", "6. **KeyError**\n", " - Cause: Trying to access a dictionary key that doesn't exist\n", " - Example: `my_dict = {\"a\": 1, \"b\": 2}` then `print(my_dict[\"c\"])`\n", " - Fix: Check if the key exists before accessing or use the `.get()` method: `my_dict.get(\"c\", \"Key not found\")`\n", "\n", "When you encounter an error:\n", "1. Read the (end of the) error message carefully - it often points to the line where the error occurred and gives a description of the problem.\n", "2. Check the line number and surrounding code for issues.\n", "3. If you're unsure, try searching the error message online or ask AI tools for help.\n", "\n", "Remember, errors are not failures - they're opportunities to learn and improve your code!" ] }, { "cell_type": "markdown", "id": "f9a1bba6-d59b-49f3-bcdf-92dc65e55d2c", "metadata": {}, "source": [ "## Loops 🔄\n", "By the end of this section, you will be able to:\n", "1. 
Understand the concept of **iteration** in programming\n", "2. Write and use **while loops** for indefinite iteration\n", "3. Implement **for loops** to iterate over sequences (like lists or strings)\n", "4. Avoid and handle infinite loops" ] }, { "cell_type": "markdown", "id": "c2609b96-b5e6-408e-8afc-00da192f807e", "metadata": {}, "source": [ "### while loop" ] }, { "cell_type": "code", "execution_count": 55, "id": "303f4f0f-84a3-47de-9480-215e65ddc664", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "0\n", "1\n", "2\n", "3\n", "4\n", "5\n", "Have fun\n" ] } ], "source": [ "i = 0\n", "while i < 6:\n", " print(i)\n", " i = i + 1\n", "\n", "print(\"Have fun\")" ] }, { "cell_type": "markdown", "id": "09046e3c-e12b-4e6d-b7f8-ccc141ed89cb", "metadata": {}, "source": [ "---\n", "**Be careful** when defining your loop condition: if it is always true, the loop will never end.\n", "\n", "This can be useful in generators, though, which we cover in the advanced courses.\n", "\n", "---" ] }, { "cell_type": "code", "execution_count": 56, "id": "6d1a8814-4e73-4449-9803-c00b696f57c3", "metadata": {}, "outputs": [], "source": [ "# Use kernel interrupt should you be stuck\n", "# Keyboard shortcut: Esc i i\n", "\n", "#while True:\n", "# print(\"This is an infinite loop that will never end\")" ] }, { "cell_type": "markdown", "id": "31054a7d-9d18-483f-ae2c-dd30b2898ba7", "metadata": {}, "source": [ "### for loop" ] }, { "cell_type": "code", "execution_count": 57, "id": "59e2300c-6eeb-40b1-b920-84a3487ed33a", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "H\n", "e\n", "l\n", "l\n", "o\n", " \n", "W\n", "o\n", "r\n", "l\n", "d\n", "Have fun\n" ] } ], "source": [ "for letter in \"Hello World\":\n", " print(letter)\n", "\n", "print(\"Have fun\")\n", "\n", "# Note that \"letter\" is not a keyword in Python; we could give it another name\n", "# Jupyter marks Python keywords in green" ] }, { "cell_type": "code", "execution_count": 
58, "id": "d32511ba-6bcb-4c95-a015-76f1f1d80322", "metadata": {}, "outputs": [], "source": [ "binaries = [2, 4, 8, 16, 32, 64]" ] }, { "cell_type": "code", "execution_count": 59, "id": "e9b37075-1611-4bf0-a155-6cced8ab72b7", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "I am at 2 now\n", "I am at 4 now\n", "I am at 8 now\n", "I am at 16 now\n", "I am at 32 now\n", "I am at 64 now\n" ] } ], "source": [ "for number in binaries:\n", " print(f\"I am at {number} now\")" ] }, { "cell_type": "markdown", "id": "94460c61-32c4-4339-a9fd-8717fcb6388c", "metadata": {}, "source": [ "> Ask Claude 🤖💬: `Explain the difference between a for loop and a while loop in Python. When would you use one over the other?`\n", "> \n", "> Ask ChatGPT 🤖💬: `Can you give me an example of how to use a for loop to iterate over a dictionary in Python?`" ] }, { "cell_type": "markdown", "id": "9893089a-cec8-425d-9b68-98c0a38aa949", "metadata": {}, "source": [ "" ] }, { "cell_type": "markdown", "id": "d35cb71a-0e72-4422-9c28-57e33c7bcf2e", "metadata": {}, "source": [ "## Keyboard Shortcuts ⌨️\n", " > **Note**: The shortcuts shown here also work in VS Code by the way." 
] }, { "cell_type": "markdown", "id": "888b68b2-095c-436d-8e56-902583646501", "metadata": {}, "source": [ "### General Shortcuts\n", "| Shortcut | Action |\n", "|----------|--------|\n", "| `[ESC]` | Go to command mode |\n", "| `[Enter]` | Enter edit mode for the selected cell |" ] }, { "cell_type": "markdown", "id": "9abd2041-4fc3-4dda-a82e-9c2fd4fc738f", "metadata": {}, "source": [ "### Command Mode Shortcuts\n", "| Shortcut | Action |\n", "|----------|--------|\n", "| `i i` | Interrupt Kernel |\n", "| `a` | Insert cell above |\n", "| `b` | Insert cell below |\n", "| `m` | Convert cell to Markdown |\n", "| `y` | Convert cell to Code |\n", "| `d d` | Delete selected cells |\n", "| `Shift Enter` | Execute cell and select below |\n", "| `Ctrl Enter` | Execute cell and stay |\n", "| `↑` / `↓` | Select cell above/below |\n", "| `Shift ↑` / `↓` | Extend selection above/below |\n", "| `Shift m` | Merge selected cells |\n", "| `Ctrl Shift -` | Split cell at cursor |\n", "| `c` | Copy selected cells |\n", "| `v` | Paste cells below |\n", "| `x` | Cut selected cells |\n", "\n", "### Edit Mode Shortcuts\n", "| Shortcut | Action |\n", "|----------|--------|\n", "| `Ctrl /` | Comment/uncomment selected lines (Windows/Linux); On Mac use `Cmd /`|\n", "| `Ctrl z` | Undo (within a Cell); On Mac use `Cmd z` |\n", "| `Ctrl Shift z` | Redo (within a Cell); On Mac use `Cmd Shift z` |\n", "\n", "---\n", "For more shortcuts, visit: [Jupyter Notebook Shortcuts](https://towardsdatascience.com/jypyter-notebook-shortcuts-bf0101a98330)" ] }, { "cell_type": "markdown", "id": "b1732ec7-1260-40f7-b7f8-c4364e4c816d", "metadata": {}, "source": [ "## Virtual Environments 📦\n", "By the end of this section, you will be able to:\n", "1. Explain the **purpose and benefits** of virtual environments in Python\n", "2. Create and activate a virtual environment using **conda**\n", "3. Install and manage **libraries** within a virtual environment\n", "4. 
Understand the difference between **conda and pip** for package management\n", "5. Create, use and export **environment.yml** files for project reproducibility" ] }, { "cell_type": "markdown", "id": "ea345ab4", "metadata": {}, "source": [ "### Libraries\n", "A library in Python (like NumPy or Matplotlib) is simply a **collection of pre-written code** that someone has created to solve specific problems. You can use this code in your own programs without having to write it from scratch.\n", "\n", "The vast number of available libraries is a great strength of Python because they **enhance the functionality and versatility** of Python." ] }, { "cell_type": "code", "execution_count": 60, "id": "e8defef0", "metadata": {}, "outputs": [], "source": [ "import numpy as np" ] }, { "cell_type": "code", "execution_count": 61, "id": "27651817-ad6f-4c05-a04d-8cf2d5bcd9a5", "metadata": { "ExecuteTime": { "end_time": "2022-03-08T14:59:04.965247Z", "start_time": "2022-03-08T14:59:04.952251Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[1, 2, 4, 8, 16]\n" ] } ], "source": [ "# Convert a list to a numpy array\n", "\n", "test_list = [1, 2, 4, 8, 16]\n", "print(test_list)" ] }, { "cell_type": "code", "execution_count": 62, "id": "f426f866-4cfc-4513-a6e1-c93777e709b2", "metadata": { "ExecuteTime": { "end_time": "2022-03-08T14:59:04.980254Z", "start_time": "2022-03-08T14:59:04.968255Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[ 1 2 4 8 16]\n" ] } ], "source": [ "test_list_converted = np.asarray(test_list)\n", "print(test_list_converted)" ] }, { "cell_type": "code", "execution_count": 63, "id": "a24717f5", "metadata": {}, "outputs": [], "source": [ "import matplotlib.pyplot as plt" ] }, { "cell_type": "code", "execution_count": 64, "id": "8013eb55", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[]" ] }, "execution_count": 64, "metadata": {}, "output_type": "execute_result" }, { "data": { "image/png": "", 
"text/plain": [ "
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "plt.plot(test_list_converted)" ] }, { "cell_type": "markdown", "id": "a1bcae5e-04ce-4328-af0c-ac0a9c46b88f", "metadata": {}, "source": [ "### Why use virtual environments?\n", "\n", "**In a nutshell**: Think of environments like sandboxes and the libraries (like Numpy) like children. If you put too many children in one sandbox there will be conflicts. So it is better to have separate sandboxes (environments) to keep it civil.\n", "\n", "> Ask Claude 🤖💬: `Why are virtual environments important in Python development?`\n", "\n", "In Python, virtual environments are used to create isolated environments for your projects, allowing you to manage dependencies and packages separately for each project. They are used to solve a common problem in software development: conflicting dependencies and package versions.\n", "\n", "In Detail:\n", "\n", "- **Isolation**: Virtual environments provide a sandboxed environment for your Python projects. Each virtual environment is self-contained, meaning it has its own directory structure and doesn't interfere with other Python projects or the system-wide Python installation. This isolation helps prevent conflicts between packages and dependencies.\n", "\n", "- **Dependency Management**: In a virtual environment, you can install and manage specific versions of Python packages and libraries independently of the global Python environment. This allows you to specify the exact package versions required for your project.\n", "\n", "- **Collaboration**: When collaborating on a Python project, you can share the project's virtual environment configuration with others. They can then create the same virtual environment, ensuring that everyone is working with the same packages.\n", "\n", "- **Testing and Development**: Virtual environments are crucial for testing and development. 
They allow you to create an isolated environment where you can experiment with different package versions and configurations.\n", "\n", "- If you install everything in your base environment, installing new packages can lead to conflicts." ] }, { "cell_type": "markdown", "id": "2270dfc7-5cde-4b81-ad4b-d457a71def95", "metadata": {}, "source": [ "### Anaconda environments\n", "\n", "Anaconda environments are the go-to solution when you are on your own computer and can install Anaconda, or when you are working in Anaconda Cloud.\n", "\n", "A very nice cheat sheet for the commands:\n", "\n", "https://docs.conda.io/projects/conda/en/4.6.0/_downloads/52a95608c49671267e40c689e0bc00ca/conda-cheatsheet.pdf\n", "\n", "**Advantages:**\n", "\n", " - Anaconda automatically checks dependencies for your packages and installs the necessary additional libraries\n", " - It is easy to switch environments\n", " \n", "**Disadvantages:**\n", "\n", " - Anaconda is a bit intrusive\n", " - Not available on every machine\n", "\n", "Benchmarks have shown that packages from Anaconda, which use the Intel MKL (Math Kernel Library), are often faster on Intel CPUs than the same packages installed with pip:\n", "\n", "https://www.youtube.com/watch?v=AWWaL6pZieo\n", "\n", "**Pip** is nevertheless used a lot. Often when you google how to install a package, you will find the pip install command first. \n", "\n", "**DO NOT MIX** conda and pip installs in the same environment, as neither package manager can cross-check compatibility with the packages installed by the other one. \n", "\n", "You will **not** get an error message and the installation will most likely go through **BUT** you might get **unexpected behavior** later. \n", "\n", "Only use pip install in a conda environment as a last resort. By now most widely used packages available through pip are also available through conda. 
Just google \"conda install packagename\"" ] }, { "attachments": {}, "cell_type": "markdown", "id": "a53f4137-01c8-45a9-a51c-95fdf4c7e8ca", "metadata": {}, "source": [ "### YAML files\n", "\n", "The best and easiest way to create a virtual environment is to write a .yaml file that lists all the packages of the environment. This way, the conda package manager can check beforehand which version numbers of all the packages work together.\n", "\n", "They are regular text files with a .yml or .yaml file ending and look like this:\n", "\n", "--------------------------------------------\n", "name: lab_python_course_env\n", "channels:\n", " - conda-forge\n", " - defaults\n", "\n", "dependencies:\n", "\n", " - jupyterlab\n", " - matplotlib\n", " - numpy\n", " - scipy\n", " - pandas\n", " - altair\n", " - h5py\n", " - openpyxl\n", " - vega_datasets\n", " - sympy\n", " - dask\n", " - ipywidgets\n", " - ipympl\n", " - nodejs\n", " - conda-forge::ffmpeg\n", " - conda-forge::jsonschema-with-format-nongpl\n", " - conda-forge::webcolors\n", " \n", "Install the environment by running this command in the command line while in the same folder as the .yml file:\n", "> `conda env create --file lab_python_course_env.yml`\n", "\n", "--------------------------------------------\n", "In .yml files **name** specifies the environment name, **channels** tell conda where to look for the packages, and **dependencies** are the libraries that you want." ] }, { "cell_type": "markdown", "id": "f2d84fb9-9309-4a5a-9d82-bfd6704ca5ef", "metadata": {}, "source": [ "### Add Conda to Powershell\n", "\n", "In Windows out of the box you unfortunately have to deal with multiple shells. 
To make conda commands available in Windows PowerShell, use these two commands.\n", "\n", "In Anaconda Prompt:\n", "> `conda init powershell`\n", "\n", "In PowerShell:\n", "> `Set-ExecutionPolicy -ExecutionPolicy RemoteSigned -Scope CurrentUser`" ] }, { "cell_type": "markdown", "id": "1a7c1a3d-e583-4e08-ad84-99eed6afd0e4", "metadata": {}, "source": [ "### Activating environments\n", "\n", "**After installation you need to activate the environment; it does not activate itself.**\n", "\n", "> `conda activate lab_python_course_env`\n", "\n", "You need to do this every time you start a new shell, as the default environment is the base environment." ] }, { "cell_type": "markdown", "id": "35c6cca7-bac2-47a3-8336-ee9955901cd9", "metadata": {}, "source": [ "### Default environments in Jupyter Lab\n", "\n", "- Click on your active environment in the top right\n", "- Click on the gear icon\n", " \n", "\n", "\n", "- Copy the Python Path from the environment you want to make default\n", "\n", "\n", " \n", "- In the settings, paste it into the first box\n", "\n", "" ] }, { "cell_type": "markdown", "id": "09d55970-91ad-4d58-820a-03b102d1f0c4", "metadata": {}, "source": [ "Installed environments are available everywhere on your computer, not just in the folder you installed them from.\n", "> Ask Claude 🤖💬: `Give me a machine learning example in Python + the YAML file with the necessary libraries`" ] }, { "cell_type": "markdown", "id": "2f8d96f8-ebfb-4417-b31e-c6055e49c637", "metadata": {}, "source": [ "### Anaconda Cloud\n", "\n", "Anaconda Cloud has **pre-installed** environments but you **can't install additional libraries** into them.\n", "\n", "You can **install your own environment** with a .yaml file the same way as on your own computer.\n", "\n", "> **However**: for free you only get 5GB of storage, so make sure to **clear the cache** after installation (Disk usage->Clear Cache)." 
] }, { "cell_type": "markdown", "id": "2f918f46-9716-41a2-953c-b14a271863fb", "metadata": {}, "source": [ "### Exporting environments\n", "\n", "You can export your current environment with the exact version numbers of all the libraries using \n", "\n", "> `conda env export > export_env.yml`\n", "\n", "This ensures that anyone who recreates the environment from this file gets the **exact same package versions** as you.\n", "\n", "When you are using a version control system like **Git** (we cover Git in the advanced courses), you can put the **.yaml file** for your Python code into the **repository**." ] }, { "cell_type": "markdown", "id": "603f6146-e892-4cb7-adbe-84c9d496720f", "metadata": {}, "source": [ "### Word of advice\n", "\n", "Once you have a working environment, do not update/change it. If you need more modules later, it is generally a better idea to create a new environment with the additional packages so that conda can check again which version numbers are compatible." ] }, { "cell_type": "markdown", "id": "194a2b3f-afb1-4eca-89f2-1c9cb9b63a92", "metadata": {}, "source": [ "## First steps with numpy 🔢\n", "By the end of this section, you will be able to:\n", "1. Understand the basic concept and **benefits of NumPy arrays**\n", "2. **Create** and manipulate NumPy arrays\n", "3. Perform basic **mathematical operations** on NumPy arrays\n", "4. Use NumPy's built-in functions for **array manipulation and analysis**\n", "5. Recognize the **performance benefits** of NumPy over standard Python lists" ] }, { "cell_type": "markdown", "id": "07eba926-bfd2-4b79-8b5c-f23bd4ea34f2", "metadata": {}, "source": [ "Numpy arrays are stored contiguously in memory. 
Processing can therefore be up to 100x faster than with lists.\n", "\n", "Built-in functions for fast computation are written in C or C++" ] }, { "cell_type": "code", "execution_count": 65, "id": "6b950ac8-f9c7-4a42-9c29-1d4a41faf230", "metadata": { "ExecuteTime": { "end_time": "2022-03-08T14:59:04.902255Z", "start_time": "2022-03-08T14:59:04.657247Z" } }, "outputs": [], "source": [ "import numpy as np" ] }, { "cell_type": "markdown", "id": "9c914d88-ad84-452d-8814-65eee0be5f0d", "metadata": {}, "source": [ "### Initialize an array" ] }, { "cell_type": "code", "execution_count": 66, "id": "3dcda521-61b8-48e3-9f98-d1d681c8c6a6", "metadata": { "ExecuteTime": { "end_time": "2022-03-08T14:59:04.917253Z", "start_time": "2022-03-08T14:59:04.908254Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[1 2 3 4 5]\n" ] } ], "source": [ "# Initialize 1D array\n", "\n", "test_array = np.array([1, 2, 3, 4, 5])\n", "print(test_array)" ] }, { "cell_type": "code", "execution_count": 67, "id": "fdf94ea3-042c-4011-81f9-d0dd156e3416", "metadata": { "ExecuteTime": { "end_time": "2022-03-08T14:59:04.965247Z", "start_time": "2022-03-08T14:59:04.952251Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[1, 3, 5, 7, 9]\n" ] } ], "source": [ "# Start with a plain Python list (converted to a numpy array in the next cell)\n", "\n", "test_list = [1, 3, 5, 7, 9]\n", "print(test_list)" ] }, { "cell_type": "code", "execution_count": 68, "id": "03137193-8a60-45cb-b406-f44ff598db64", "metadata": { "ExecuteTime": { "end_time": "2022-03-08T14:59:04.980254Z", "start_time": "2022-03-08T14:59:04.968255Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[1 3 5 7 9]\n" ] } ], "source": [ "test_list_converted = np.asarray(test_list)\n", "print(test_list_converted)" ] }, { "cell_type": "code", "execution_count": 69, "id": "6420e350-cf25-4807-9b43-5c5323a76950", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([-5. , -4.5, -4. , -3.5, -3. , -2.5, -2. , -1.5, -1. , -0.5, 0. 
,\n", " 0.5, 1. , 1.5, 2. , 2.5, 3. , 3.5, 4. , 4.5])" ] }, "execution_count": 69, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Create an array with a range of values and the step in between\n", "arr1 = np.arange(start=-5, stop=5, step=0.5)\n", "arr1" ] }, { "cell_type": "code", "execution_count": 70, "id": "88a76fff-511d-46b0-acf8-a0c89a6d08fb", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([-5. , -4.5, -4. , -3.5, -3. , -2.5, -2. , -1.5, -1. , -0.5, 0. ,\n", " 0.5, 1. , 1.5, 2. , 2.5, 3. , 3.5, 4. , 4.5, 5. ])" ] }, "execution_count": 70, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Create an array in a range with a fixed number of values (symmetric)\n", "arr2 = np.linspace(start=-5, stop=5, num=21)\n", "arr2" ] }, { "cell_type": "markdown", "id": "ae160989-096c-4a61-980c-246f5c9c92fc", "metadata": {}, "source": [ "### Accessing array elements\n", "Numpy arrays start counting at 0 (like in C):" ] }, { "cell_type": "code", "execution_count": 71, "id": "2c528b3d-08fc-4ac3-a341-09818cf8a943", "metadata": { "ExecuteTime": { "end_time": "2022-03-08T14:59:05.043251Z", "start_time": "2022-03-08T14:59:05.018254Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "First element: 2\n" ] } ], "source": [ "# Accessing 1D array\n", "\n", "arr = np.array([2, 4, 8, 16])\n", "\n", "print(f'First element: {arr[0]}')" ] }, { "cell_type": "code", "execution_count": 72, "id": "852ef710-3254-4962-9ffb-08f2e7e4db82", "metadata": { "ExecuteTime": { "end_time": "2022-03-08T14:59:05.074251Z", "start_time": "2022-03-08T14:59:05.062250Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Last element: 16\n" ] } ], "source": [ "# Accessing array counting from the end (negative indexing)\n", "\n", "print(f'Last element: {arr[-1]}')" ] }, { "cell_type": "markdown", "id": "179dd179-0380-4037-8a0e-a18ec661763c", "metadata": {}, "source": [ "### Array Iteration" ] }, { "cell_type": 
"code", "execution_count": 73, "id": "51820452-fe12-4043-a66f-412c24313da9", "metadata": { "ExecuteTime": { "end_time": "2022-03-08T14:59:05.213248Z", "start_time": "2022-03-08T14:59:05.202249Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "2\n", "4\n", "8\n", "16\n" ] } ], "source": [ "# Simple iteration on 1D array similar to lists\n", "for x in arr:\n", " print(x)" ] }, { "cell_type": "code", "execution_count": 74, "id": "1fa1ab6e-7134-440d-9c0d-f8a3634b54df", "metadata": { "ExecuteTime": { "end_time": "2022-03-08T14:59:05.229250Z", "start_time": "2022-03-08T14:59:05.218252Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "index 0: 2\n", "index 1: 4\n", "index 2: 8\n", "index 3: 16\n" ] } ], "source": [ "# Or use enumerate, if you want the loop iteration index\n", "for idx, x in enumerate(arr):\n", " print(f\"index {idx}: {x}\")" ] }, { "cell_type": "markdown", "id": "9aa42ea8-5260-4ba8-a024-63f31f534101", "metadata": {}, "source": [ "### numpy.where for searching" ] }, { "cell_type": "code", "execution_count": 75, "id": "bbf02e4c-d1c7-451c-a005-bd5822ea4edb", "metadata": { "ExecuteTime": { "end_time": "2022-03-08T14:59:05.489250Z", "start_time": "2022-03-08T14:59:05.477249Z" } }, "outputs": [ { "data": { "text/plain": [ "(array([1]),)" ] }, "execution_count": 75, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Suppose you want to find the indices where the value of an array is 4:\n", "\n", "np.where(arr == 4)" ] }, { "cell_type": "markdown", "id": "5b031d5b-249e-4cc4-946c-32af397f2410", "metadata": {}, "source": [ "### np.min() np.max() etc." 
] }, { "cell_type": "code", "execution_count": 76, "id": "6c1ca97c-54c5-4b4c-8326-5075caa6587f", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "The minimum of arr is 8 at index 1\n", "The maximum of arr is 1024 at index 3\n" ] } ], "source": [ "arr = np.array([16, 8, 32, 1024, 64])\n", "minimum = np.min(arr)\n", "maximum = np.max(arr)\n", "min_idx = np.argmin(arr)\n", "max_idx = np.argmax(arr)\n", "\n", "print(f\"The minimum of arr is {minimum} at index {min_idx}\")\n", "print(f\"The maximum of arr is {maximum} at index {max_idx}\")" ] }, { "cell_type": "markdown", "id": "1b185d53-19f5-4ffa-9788-59b077f04232", "metadata": {}, "source": [ "" ] }, { "cell_type": "code", "execution_count": 77, "id": "1abbb1a3-e634-4a31-9e08-91eb143474e8", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "The squareroot of 16 is 4.0\n" ] } ], "source": [ "num = 16\n", "print(f\"The squareroot of {num} is {np.sqrt(num)}\")" ] }, { "cell_type": "code", "execution_count": 78, "id": "98b1a6a8-6a4d-44ae-be5f-923f7e41ed89", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "3.141592653589793" ] }, "execution_count": 78, "metadata": {}, "output_type": "execute_result" } ], "source": [ "np.pi" ] }, { "cell_type": "code", "execution_count": 79, "id": "6052021d-5005-41df-b174-e1132ea1b3cf", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "7.38905609893065" ] }, "execution_count": 79, "metadata": {}, "output_type": "execute_result" } ], "source": [ "np.exp(2)" ] }, { "cell_type": "code", "execution_count": 80, "id": "ff7ecfed-4085-4972-9dd6-1936c9932a3d", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "4.9" ] }, "execution_count": 80, "metadata": {}, "output_type": "execute_result" } ], "source": [ "np.abs(-4.9)" ] }, { "cell_type": "markdown", "id": "147be3cb-49e9-4f99-bf2a-dfe9528ec895", "metadata": {}, "source": [ "### Summary\n", "\n", "This is not a full tutorial on Numpy, this is just a quick 
look at what external libraries can do.\n", "\n", "In the advanced courses (https://training-scientists.com) we look at\n", "\n", "+ multidimensional array initialization, access, slicing and iteration\n", "+ joining / stacking of arrays\n", "+ Filtering arrays (e.g. give me all values in the array larger than a certain value, or all even numbers)\n", "+ Performance comparisons between numpy arrays and lists\n", "+ pre-allocation of arrays\n", "+ linear algebra\n", "+ statistics\n", "+ Fourier Transforms\n", "\n", "and a lot more hands on examples of everything numpy can do like\n", "\n", "+ interpolation\n", "+ fitting\n", "+ filtering noise out of large data sets" ] }, { "cell_type": "markdown", "id": "f999c1eb-6ffa-4ede-a910-0d49cc2ff215", "metadata": {}, "source": [ "> Ask Claude 🤖💬: `What are the main advantages of using NumPy arrays over Python lists for numerical computations?`" ] }, { "cell_type": "markdown", "id": "02709ee6-9394-4caf-bf82-1ff2b94ea039", "metadata": {}, "source": [ "## First steps with matplotlib 📊\n", "By the end of this section, you will be able to:\n", "1. **Create plots** using Matplotlib (line plots, scatter plots, histograms)\n", "2. **Customize appearance** (colors, labels, titles, legends)\n", "3. 
Generate and understand different **types of visualizations**" ] }, { "cell_type": "code", "execution_count": 81, "id": "0b30aa28-8e0c-4b20-ab76-7e4d676d7da8", "metadata": { "ExecuteTime": { "end_time": "2022-03-08T14:59:18.201945Z", "start_time": "2022-03-08T14:59:17.720478Z" } }, "outputs": [], "source": [ "import numpy as np\n", "import matplotlib.pyplot as plt" ] }, { "cell_type": "markdown", "id": "9bc84d22-0c36-4669-8041-3a99edc12db5", "metadata": {}, "source": [ "### x / y plot:" ] }, { "cell_type": "code", "execution_count": 82, "id": "b212c339-29e6-4d6c-a33e-34952a0891e8", "metadata": { "ExecuteTime": { "end_time": "2022-03-08T14:59:18.231912Z", "start_time": "2022-03-08T14:59:18.212927Z" } }, "outputs": [], "source": [ "# Generate two arrays to be plotted as x-values\n", "\n", "x = np.linspace(start=0, stop=10, num=11)\n", "x_fine = np.linspace(start=0, stop=10, num=101)" ] }, { "cell_type": "markdown", "id": "e62d45af-dbeb-48a4-bb5b-246055e1fa00", "metadata": {}, "source": [ "Polynomial function from functions section\n", "DO NOT define a function twice in a production script\n", "```python\n", "def polynomial_function(x):\n", " return (x**2 + x + 5)\n", "```" ] }, { "cell_type": "code", "execution_count": 83, "id": "fd217f5a-d655-471f-8825-a4e2f523740a", "metadata": { "ExecuteTime": { "end_time": "2022-03-08T14:59:18.247912Z", "start_time": "2022-03-08T14:59:18.234922Z" } }, "outputs": [], "source": [ "# Calculate the y values running the polynomial function on the x values\n", "\n", "y = polynomial_function(x)\n", "y_fine = polynomial_function(x_fine)" ] }, { "cell_type": "code", "execution_count": 84, "id": "7412ff8d-7ad2-41d5-a31f-a92b15527841", "metadata": { "ExecuteTime": { "end_time": "2022-03-08T14:59:18.498924Z", "start_time": "2022-03-08T14:59:18.250916Z" } }, "outputs": [ { "data": { "text/plain": [ "[]" ] }, "execution_count": 84, "metadata": {}, "output_type": "execute_result" }, { "data": { "image/png": "", "text/plain": [ "
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "# Create a figure to plot these two x/y sets\n", "\n", "plt.figure()\n", "plt.plot(x,y,'o')\n", "plt.plot(x_fine, y_fine)" ] }, { "cell_type": "code", "execution_count": 85, "id": "bdafb0fb-946a-40ff-a6ec-66452a2e9301", "metadata": { "ExecuteTime": { "end_time": "2022-03-08T14:59:19.060480Z", "start_time": "2022-03-08T14:59:18.508925Z" } }, "outputs": [ { "data": { "text/plain": [ "" ] }, "execution_count": 85, "metadata": {}, "output_type": "execute_result" }, { "data": { "image/png": "", "text/plain": [ "
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "# Customize the plot\n", "\n", "plt.rcParams.update({'font.size': 14}) # Increase font size (for entire notebook)\n", "\n", "plt.figure(figsize = (6,6)) # Change figure size and aspect ratio\n", "\n", "plt.plot(x,y, '*', label='Discrete') # Add labels\n", "plt.plot(x_fine, y_fine,ls='--', label='Quasi-continuous')\n", "\n", "plt.xlabel('$x$') # Use LaTeX notation\n", "plt.ylabel('$y = x^2 + x + 5$')\n", "\n", "plt.legend(loc='upper left') # Add legend" ] }, { "cell_type": "markdown", "id": "e6cf5f8e-68ef-45c3-b694-89abd1b484c6", "metadata": {}, "source": [ "### Scatter plots" ] }, { "cell_type": "code", "execution_count": 86, "id": "41448e21-f516-4dae-b178-faffeb76387f", "metadata": { "ExecuteTime": { "end_time": "2022-03-08T14:59:19.075481Z", "start_time": "2022-03-08T14:59:19.063481Z" } }, "outputs": [], "source": [ "# Generate array of random numbers with normal distribution for x values:\n", "x_rand = np.random.normal(loc=3, scale=1.0, size=2000)\n", "# The same for y values:\n", "y_rand = np.random.normal(loc=3, scale=1.0, size=2000)" ] }, { "cell_type": "code", "execution_count": 87, "id": "db3b2e50-d322-405c-b7d4-c277268c0611", "metadata": {}, "outputs": [], "source": [ "# define the dictionaries for plotting options:\n", "random_dict = {\"color\": 'gray', \"label\": 'Random values'}\n", "mean_dict = {\"s\":200, \"color\":'blue', \"marker\":'*', \"label\": 'Average'}" ] }, { "cell_type": "code", "execution_count": 88, "id": "c71c756b-c9b0-466d-8ee9-3609a7b4dbb3", "metadata": { "ExecuteTime": { "end_time": "2022-03-08T14:59:19.308480Z", "start_time": "2022-03-08T14:59:19.077479Z" } }, "outputs": [ { "data": { "text/plain": [ "" ] }, "execution_count": 88, "metadata": {}, "output_type": "execute_result" }, { "data": { "image/png": "", "text/plain": [ "
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "# Plot as scatter\n", "\n", "plt.figure(figsize=(6,6))\n", "plt.scatter(x_rand, y_rand, **random_dict) # all values as gray points\n", "\n", "# The mean (x,y) as a blue star:\n", "plt.scatter(np.mean(x_rand), np.mean(y_rand), **mean_dict) \n", "plt.xlabel('x')\n", "plt.ylabel('y')\n", "plt.xlim([-1 , 7])\n", "plt.ylim([-1 , 7])\n", "plt.legend(loc='upper left')" ] }, { "cell_type": "markdown", "id": "4259d533-3376-45a3-b1c6-262870c41acd", "metadata": {}, "source": [ "### Histogram" ] }, { "cell_type": "code", "execution_count": 89, "id": "f5ff429a-6541-4885-a962-b9bfe7224750", "metadata": { "ExecuteTime": { "end_time": "2022-03-08T14:59:20.433070Z", "start_time": "2022-03-08T14:59:19.311482Z" } }, "outputs": [ { "data": { "text/plain": [ "Text(0, 0.5, 'occurences')" ] }, "execution_count": 89, "metadata": {}, "output_type": "execute_result" }, { "data": { "image/png": "", "text/plain": [ "
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "# Plotting the same data in a different way using the same plot style\n", "plt.figure(figsize = (6,6))\n", "plt.hist(x_rand, bins=50, **random_dict)\n", "plt.xlabel('x')\n", "plt.ylabel('occurences')" ] }, { "cell_type": "markdown", "id": "f433ccaf-30c1-43e1-82b6-46408e7b6f7c", "metadata": {}, "source": [ ">**when we change the dictionary, all we need to do is rerun the code**" ] }, { "cell_type": "markdown", "id": "820c8b8e-d2c1-4572-bc47-4ca71334ed55", "metadata": {}, "source": [ "### Contour plots" ] }, { "cell_type": "code", "execution_count": 90, "id": "60598ab5-a8b5-4ccf-b341-f7164c0b2725", "metadata": { "ExecuteTime": { "end_time": "2022-03-08T14:59:20.915056Z", "start_time": "2022-03-08T14:59:20.436023Z" } }, "outputs": [ { "data": { "text/plain": [ "Text(0, 0.5, '$y$')" ] }, "execution_count": 90, "metadata": {}, "output_type": "execute_result" }, { "data": { "image/png": "", "text/plain": [ "
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "x_contour = np.arange(0, 11, 0.1)\n", "y_contour = np.arange(0, 11, 0.1)\n", "# Create 2D meshgrid of x and y values:\n", "X,Y = np.meshgrid(x_contour, y_contour)\n", "z = X * Y\n", "\n", "plt.figure(figsize = (6,6))\n", "\n", "plt.contourf(X, Y, z) ##\n", "cbar = plt.colorbar() ##\n", "cbar.set_label('z = x*y') ##\n", "\n", "plt.xlabel('$x$')\n", "plt.ylabel('$y$')" ] }, { "cell_type": "markdown", "id": "0e49155c-e32a-4889-a5e8-ac34abd89125", "metadata": {}, "source": [ "### Pie charts" ] }, { "cell_type": "code", "execution_count": 91, "id": "33954a21-eb5b-48c5-abbc-5280f0ef8afb", "metadata": {}, "outputs": [ { "data": { "image/png": "", "text/plain": [ "
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "sizes = [20, 80]\n", "labels = ['Actual Programming', 'Debugging']\n", "colors = ['gold', 'lightcoral']\n", "explode = (0.1, 0) # explode 1st slice\n", "\n", "plt.pie(sizes, explode=explode, labels=labels, colors=colors, autopct='%1.1f%%', startangle=140)\n", "plt.axis('equal')\n", "plt.title('Time spent programming')\n", "plt.show()" ] }, { "cell_type": "markdown", "id": "77fdcb09-8879-4118-bafb-b1dee3b3d324", "metadata": {}, "source": [ "### Summary\n", "\n", "Matplotlib is a powerful plotting library that can produce professional looking plots.\n", "\n", "In the advanced courses we will look at\n", "+ inset Plots\n", "+ interactive plots & widgets\n", "+ creating videos from a series of plots\n", "+ advanced plotting options\n", "+ making plots publication ready\n", "\n", "If you are interested:\n", "+ [Python Basics](https://training-scientists.com/python-basics-course/)\n", "+ [Python for Scientists & Engineers](https://training-scientists.com/python-for-scientists-and-engineers/)\n", "+ [Python for Biologists](https://training-scientists.com/python-for-biologists/)" ] }, { "cell_type": "markdown", "id": "ec45c232-23b7-40a0-8f25-d447bd2f754a", "metadata": {}, "source": [ "> Ask Claude 🤖💬: `What are the key components of a matplotlib figure and what does each do?`\n", "\n", "> Ask ChatGPT 🤖💬: `Create a matplotlib plot with multiple y axes`" ] }, { "cell_type": "markdown", "id": "c30f1684-769c-411f-b1ea-a12b448294db", "metadata": {}, "source": [ "## FAQ ❓ & Common issues I see beginners struggle with" ] }, { "cell_type": "markdown", "id": "bac2d43a-bdfb-402a-ba0c-861fa0f7c009", "metadata": {}, "source": [ "### Whenever I look at your code it makes sense, when I try it I get errors 💻❓\n", "Learning programming is like learning anything else. Think of learning a new language or a music instrument.\n", "\n", "It is one thing to watch someone else play the piano and think \"He plays well\". 
It is another thing entirely to be able to play yourself.\n", "\n", "A few tips:\n", "\n", "+ keep the code cells small. It is easier to find a bug in a cell with 3 lines of code than to find the needle in the haystack\n", "+ work your way forward in increments that are as small as possible. Write a line of code, execute. Write another line, execute etc.\n", "+ learn how to read error messages (the important stuff is usually in the last line)\n", "+ use AI tools. Copy/paste buggy code to ChatGPT/Claude or use Anaconda Cloud. With basic errors they can usually help" ] }, { "cell_type": "markdown", "id": "d8be4309-afd8-4280-a2e0-6f24c1aeec8b", "metadata": {}, "source": [ "### How to think like a programmer? 🤔\n", "\n", "Tricky question. In this video I showed you the tools you need to get started. Imagine you're a handyman and just learned how to use a screwdriver, hammer, saw etc. 🛠️\n", "Instead of trying to build an entire house now, start small. Build a birdhouse. Then build a doghouse. Then a garage. And THEN try to build a house.\n", "\n", "A few tips 🧩 :\n", "+ programmers usually have a \"divide and conquer\" approach to solving problems. Break the project you have into as many small parts as possible and then solve them one by one. You climb a mountain one step at a time. Don't try to jump to the top in one go. AI tools might help you get there faster and be your Sherpa but they might also lead you on the wrong path.\n", "Divide and Conquer example:\n", " + File IO 💾\n", " + What format is the data that I want to read in?\n", " + How do I read that type of data in?\n", " + In what kind of data structure do I want to store the data in Python?\n", " + Read in the data\n", " + Was the data read in correctly? 
(Compare the array in Python with the data in the file)\n", " + Data analysis 🔍\n", " + Here it might make sense to work backwards and start with the question: what should the end result look like?\n", " + decide on the method of data processing: interpolation, fitting, filtering etc.\n", " + process the data\n", " + check if the result makes sense\n", " + Visualization 📊\n", " + decide on the type of plot (histogram, contour plot etc.) that best represents your result and the point you are trying to make with it\n", " + plot the end result of the data analysis\n", " + make the plot publication ready\n", "+ Learn from other people. Like watching DIY home improvement videos, you can learn from how other people solve problems. I will create some videos like this in the future, so subscribe ;)" ] }, { "cell_type": "markdown", "id": "6c9c4dbb-8bfb-473e-b670-834bea1b5659", "metadata": {}, "source": [ "### Naming Variables: ❌❌❌ Avoid Keywords and Built-in Functions 💀💀💀\n", "Basically, if the name turns green in the editor, it is a keyword or a built-in name" ] }, { "cell_type": "code", "execution_count": 92, "id": "f3f638ce-2a3a-4915-8378-a154e78cf8b6", "metadata": {}, "outputs": [], "source": [ "# Absolutely do not do this:\n", "#for = 3" ] }, { "cell_type": "markdown", "id": "a1547d37-f606-4c8e-9a40-8886c15121a8", "metadata": {}, "source": [ "### Python Keywords and Built-in Functions Reference 🐍\n", "\n", "**Keywords** 🔑\n", "\n", "| Keywords | Keywords | Keywords | Keywords | Keywords | Keywords | Keywords |\n", "|----------|----------|----------|----------|----------|----------|----------|\n", "| `False` | `None` | `True` | `and` | `as` | `assert` | `async` |\n", "| `await` | `break` | `class` | `continue`| `def` | `del` | `elif` |\n", "| `else` | `except` | `finally`| `for` | `from` | `global` | `if` |\n", "| `import` | `in` | `is` | `lambda` | `nonlocal`| `not` | `or` |\n", "| `pass` | `raise` | `return` | `try` | `while` | `with` | `yield` |\n", "\n", "**Common Built-in Functions** 
🧰\n", "\n", "| Functions | Functions | Functions | Functions | Functions | Functions | Functions | Functions | Functions | Functions |\n", "|-------------|--------------|---------------|--------------|--------------|---------------|--------------|--------------|--------------|--------------|\n", "| `abs()` | `all()` | `any()` | `ascii()` | `bin()` | `bool()` | `bytearray()`| `bytes()` | `callable()` | `chr()` |\n", "| `classmethod()`| `compile()`| `complex()` | `delattr()` | `dict()` | `dir()` | `divmod()` | `enumerate()`| `eval()` | `exec()` |\n", "| `filter()` | `float()` | `format()` | `frozenset()`| `getattr()` | `globals()` | `hasattr()` | `hash()` | `help()` | `hex()` |\n", "| `id()` | `input()` | `int()` | `isinstance()`| `issubclass()`| `iter()` | `len()` | `list()` | `locals()` | `map()` |\n", "| `max()` | `memoryview()`| `min()` | `next()` | `object()` | `oct()` | `open()` | `ord()` | `pow()` | `print()` |\n", "| `property()`| `range()` | `repr()` | `reversed()` | `round()` | `set()` | `setattr()` | `slice()` | `sorted()` | `staticmethod()`|\n", "| `str()` | `sum()` | `super()` | `tuple()` | `type()` | `vars()` | `zip()` | `__import__()`| | |\n", "\n", "**Note:** \n", "- Keywords are reserved and cannot be used as identifiers.\n", "- Built-in functions are predefined but can be overwritten (not recommended).\n", "- `True`, `False`, and `None` are constants but treated as keywords.\n", "- This list is based on Python 3.x and may vary slightly in different versions." ] }, { "cell_type": "markdown", "id": "8d4fdadd-34b1-4120-82a2-55b1a1b5b2b6", "metadata": {}, "source": [ "### What's the difference between `=` and `==` in Python?\n", "\n", "- `=` is the assignment operator. It's used to assign a value to a variable.\n", " Example: `x = 5` assigns the value 5 to the variable x.\n", "- `==` is the equality comparison operator. 
It's used to check if two values are equal.\n", " Example: `if x == 5:` checks if the value of x is equal to 5.\n", "\n", "### Which one of the AI tools should I be using?\n", "- If you don't want to spend any money, I would use all three and take advantage of their daily free-usage limits\n", "- **Anaconda Cloud** is great for in-place debugging\n", "- **Claude** is great with its Artifacts and versions\n", "- **ChatGPT** just released ChatGPT o1, which takes more time to \"think\" and has probably just overtaken Claude 3.5 Sonnet\n", "\n", "### How do I choose between using a list, tuple, or dictionary?\n", "\n", "- Use a **list** when you have a collection of related items that may change (mutable) and order matters.\n", " Example: `todo_list = ['Study', 'Exercise', 'Cook']`\n", "- Use a **tuple** for collections of items that shouldn't change (immutable) and order matters.\n", " Example: `coordinates = (4, 5)`\n", "- Use a **dictionary** when you want to store key-value pairs for quick lookup.\n", " Example: `person = {'name': 'Alice', 'age': 30, 'city': 'New York'}`\n", "\n", "### What are some common Python libraries for data analysis and when should I use them?\n", "\n", "**Answer:**\n", "- **NumPy**: For numerical computing and working with arrays. Use when you need to perform mathematical operations on large datasets efficiently.\n", "- **Pandas**: For data manipulation and analysis. Great for working with structured data in tables or time series.\n", "- **Matplotlib**: For creating static, animated, and interactive visualizations. Use when you need to create basic plots and charts.\n", "- **SciPy**: For scientific and technical computing. 
Use for more advanced statistical functions, optimization, and signal processing.\n", "- We cover all of them in the advanced courses\n", " + [Python Basics](https://training-scientists.com/python-basics-course/)\n", " + [Python for Scientists & Engineers](https://training-scientists.com/python-for-scientists-and-engineers/)\n", " + [Python for Biologists](https://training-scientists.com/python-for-biologists/)\n", "\n", "### How can I collaborate on Python projects with others using version control?\n", "\n", "Use Git as your version control system and GitHub, GitLab, or Bitbucket as your remote repository host. In the advanced courses we set this up, try it out, and use it with practical examples." ] }, { "cell_type": "markdown", "id": "0b5ec1ba-eb48-4370-b881-0872ebda2873", "metadata": {}, "source": [ "### How can I improve my skills further?\n", "\n", "1. The exercises for this course are available on https://training-scientists.com. To get a 50% discount, leave a like and a comment under the video and contact me on LinkedIn.\n", "2. Go ahead and play around with this notebook. If anything is unclear, delete it and see what happens. Break things and understand what they are good for. If you are afraid of breaking things, just create a copy of the notebook.\n", "3. Try solving small problems (that ideally are relevant to you) and start coding. Use AI tools to help you but always understand what you are copy-pasting.\n", "4. 
Consider signing up for one of my advanced courses with lots of hands-on exercises and live Zoom sessions to discuss exercises, ask questions and explore topics beyond the course\n", "\n", "> \"Ask ChatGPT: Can you suggest some small Python projects for beginners to practice their skills?\"" ] }, { "cell_type": "markdown", "id": "dfb424f5-00f5-46a8-b182-b39b96816dca", "metadata": {}, "source": [ "## Glossary 📚\n", "\n", "| Term | Definition |\n", "|------|------------|\n", "| Variable | A named storage location in computer memory that holds data. |\n", "| Data Type | A classification of data which tells the compiler or interpreter how the programmer intends to use the data. |\n", "| Function | A block of organized, reusable code that performs a specific task. |\n", "| List | An ordered, mutable collection of elements in Python. |\n", "| Tuple | An ordered, immutable collection of elements in Python. |\n", "| Dictionary | A mutable collection of key-value pairs in Python (keeps insertion order since Python 3.7). |\n", "| Loop | A programming construct that repeats a group of commands. |\n", "| Conditional Statement | A programming language construct that performs different computations or actions depending on whether a boolean condition evaluates to true or false. |\n", "| Scope | The region of a program where a variable is recognized and can be used. |\n", "| Indentation | The spaces at the beginning of a code line used to determine the grouping of statements in Python. |\n", "| String | A sequence of characters in Python, typically used to represent text. |\n", "| Boolean | A data type that has one of two possible values: True or False. |\n", "| Index | A number representing the position of an element in a sequence (like a list or string). |\n", "| Slice | A portion of a sequence, specified by a range of indices. |\n", "| Iteration | The process of repeatedly executing a set of statements. 
|" ] }, { "cell_type": "markdown", "id": "72312e32-1eed-418f-8394-d7cc75a65320", "metadata": {}, "source": [ "## Best Practices Summary 🌟\n", "\n", "Throughout this course, we've covered several best practices for Python programming. Here's a summary of key points to remember:\n", "\n", "1. **Code Readability**\n", " - Use descriptive variable and function names\n", " - Follow PEP 8 guidelines for code style (covered in more detail in the advanced courses)\n", " - Keep lines of code to at most 79 characters (PEP 8 allows up to 99 if your team agrees)\n", " - Use comments to explain 'why', not 'what'\n", "\n", "2. **Function Design**\n", " - Keep functions small and focused on a single task\n", " - Use parameters instead of relying on global variables\n", " - Return values rather than modifying global state\n", "\n", "3. **Variable Scope**\n", " - Keep variable scope as small as possible\n", " - Avoid using global variables within functions\n", "\n", "4. **Data Structures**\n", " - Choose the appropriate data structure (list, tuple, dictionary) for your needs\n", " - Use NumPy arrays for numerical computations when performance is crucial\n", "\n", "5. **Virtual Environments**\n", " - Use virtual environments to manage dependencies for different projects\n", " - Keep your base Python installation clean\n", "\n", "Remember, writing clean, readable, and maintainable code is a skill that develops with practice. Keep these best practices in mind as you continue your Python journey!" 
] }, { "cell_type": "markdown", "id": "16ae3083-08cc-498a-ad43-103a07861475", "metadata": {}, "source": [ "> \"Ask ChatGPT: Can you explain the concept of DRY (Don't Repeat Yourself) in programming and give an example in Python?\"" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.12.3" }, "toc": { "base_numbering": 1, "nav_menu": {}, "number_sections": true, "sideBar": true, "skip_h1_title": true, "title_cell": "Table of Contents", "title_sidebar": "Contents", "toc_cell": false, "toc_position": { "height": "calc(100% - 180px)", "left": "10px", "top": "150px", "width": "186.667px" }, "toc_section_display": true, "toc_window_display": false }, "varInspector": { "cols": { "lenName": 16, "lenType": 16, "lenVar": 40 }, "kernels_config": { "python": { "delete_cmd_postfix": "", "delete_cmd_prefix": "del ", "library": "var_list.py", "varRefreshCmd": "print(var_dic_list())" }, "r": { "delete_cmd_postfix": ") ", "delete_cmd_prefix": "rm(", "library": "var_list.r", "varRefreshCmd": "cat(var_dic_list()) " } }, "position": { "height": "144.183px", "left": "927px", "right": "20px", "top": "120px", "width": "350px" }, "types_to_exclude": [ "module", "function", "builtin_function_or_method", "instance", "_Feature" ], "window_display": false } }, "nbformat": 4, "nbformat_minor": 5 }