2 posts tagged with "oss"

· 5 min read

Motivation

About one year ago, I had no idea how to contribute to open-source projects, and yet it was something that I really wanted to do. I read articles and watched YouTube videos that provided good suggestions and resources, but somehow I just could not get over the initial hurdle and actually contribute to Open Source Software (OSS) projects. I tried first-contribution, starred and bookmarked projects, and did all the preparation, only to end up not contributing to any of them.

Looking back, two issues prevented me from moving forward:

  1. How do I find a project that is at the right technical level and has beginner-friendly issues?
  2. After finding such a project, how do I navigate its large codebase and start working?

I have made some progress over the past year and I hope to pen down my thoughts on open-source work. I am not a "10X" developer, but my perspective could be helpful to people who are just starting.

What do I really need to know?

As the first issue above suggests, the number of projects available to you depends on your technical ability and interests. Before anything else, it is really important to know how Git and GitHub (or GitLab) work, because most OSS projects use Git for version control and GitHub (or GitLab) to host the code for sharing among developers.

In particular, you should minimally understand:

  • What’s a commit and how to write good commit messages?
  • What’s a PR?
  • What’s a branch?
  • What are the branching workflow and the forking workflow?

With that out of the way, the main bulk of the learning is on the tech stack of the project. This can be difficult if you are completely new. However, my experience has been that, depending on the complexity of the project, you don't always need to be an expert in the tech stack to contribute. What I would recommend is to go through some tutorials and try to understand the basics of the tech stack. With that, the rest of the learning can be done on the go.

Something else I feel strongly about is getting to know the product from a user's perspective. This will open doors for you to

  • verify the correctness of the documentation (or lack of it)
  • find edge cases that are not covered
  • think of new features that can be added
  • reproduce bugs and help in the debugging process
  • participate in discussions on how to improve the product

Non-technical contributions can be very important. They also provide chances for you to interact and find out more about the project and the community. After all, the project is not just about the code. It's also about the people who are working on it.

How to understand a project's repo?

It's important to remember that it's okay to not understand everything right away. It may take time and practice to fully grasp the project's codebase. This "advice" is not unique but I think most articles and guides don’t emphasize it enough.

With limited time and also limited interest (you may never want to work on certain aspects of the project), prioritize what you should know according to what you want to do. Here are some tips that I have found useful:

  • make use of the project's developer guide, if available. This document should provide an overview of the project's architecture and workflows.
  • start by focusing on a specific aspect of the project that interests you or that you feel confident in tackling. This could be a bug or a feature that needs to be implemented. As you work on this task, you will naturally become more familiar with the codebase and be able to contribute more effectively.
  • stalk the project's ongoing issues and PRs. This will give you a sense of what the project's maintainers are working on and how they do it.

Contribute when the opportunity arises

I have found that when you start working on just a single project, it will naturally lead to other opportunities (e.g. upstream libraries, similar projects, etc). When I started working on MarkBind, a static site generator, I was also making occasional contributions to some of the plugins for markdown-it, the Markdown parser that MarkBind uses. I was fixing bugs in MarkBind by discovering the root causes in the upstream libraries and pushing fixes there to hopefully benefit other projects as well. I even made a pull request to fix the documentation that I was reading on an MDN page for <tbody>: The Table Body element, something that broke the server-side rendering of MarkBind.

Closing thoughts

Working on OSS projects can be challenging and sometimes frustrating. While it is rewarding to contribute to projects that others may benefit from, it's important to recognize that at times, it's just free labor that some people will not even appreciate. You will also realize that "The reward for good work is more work".

Being a developer involves more than just writing code. It may also include tasks such as investigation, discussion, research, proper documentation, and explaining your code to others. These are important software engineering tasks, but may not always be as satisfying as simply writing code.

My conclusion about OSS is that it's worth trying out and you will learn a lot from it. It's also not as "glamorous" as you may think. People abandon OSS projects all the time and you may not always get the recognition you deserve. I hope that this short article will help you to get started on your journey into open source. Good luck!

· 5 min read

Introduction

A while ago, I posted a question on Dev.to to find out more about technologies for recognizing GitHub Repo contributions:

Hello guys,

I am aware of all-contributors bot that helps with adding new contributors to an OSS project on GitHub, but it does not seem to be able to automatically add all contributors(those who previously contributed, before the point of using all-contributors), perhaps with some sensible defaults. I am wondering if there are any alternative tools out there that you are using to recognize contributors in your GitHub Repo?

Would love to learn more about that:)

The Problem

I needed a way to recognize and show GitHub Repo contributors in the README file of a project for more visibility. The all-contributors-cli is a CLI tool that fits my needs. However, while the Repo already has a dozen contributors, the CLI tool does not pick up anyone who contributed before the tool was set up.

Running npx all-contributors-cli init sets up the Repo with the default settings, but it does not recognize any of the existing contributors. One helpful command provided by the CLI is npx all-contributors-cli check. This command checks the Repo for any contributors not yet added in the .all-contributorsrc file and prints the list of names. Given that the Repo that I was working on has about 50 contributors, I would have to take the names from the missing list and manually add them via npx all-contributors-cli add {contributor} {contribution_type}. This is a tedious task, and I would like to automate this process.

The Solution

Inspired by the discussions in related GitHub issues, I think a Python script can help. The final script is made up of several functions that are called in the main function. I also included a few options to handle the common use cases in the script that I put together.

init

This function will initialize the Repo with the default settings. The code basically calls npx all-contributors-cli init.

import subprocess
import shlex
import sys

def init():
    print("Initialize all-contributors")
    # Pass one command string: with shell=True, a list would hand only "npx" to the shell on POSIX
    subprocess.run("npx all-contributors-cli init", shell=True)

check

This function will check the Repo for any contributors not yet added in the .all-contributorsrc file and use the list to call npx all-contributors-cli add {contributor} code. Note that I intentionally ignored dependabot[bot] and set the default contribution type to code. Change them to suit your needs! There is also a dryrun parameter that is used to control the execution of the add command.

def check(dryrun=False):
    # Pass one command string: with shell=True, a list would hand only "npx" to the shell on POSIX
    all_contributors_check_result = subprocess.run(
        "npx all-contributors-cli check",
        shell=True,
        stdout=subprocess.PIPE,
    ).stdout.decode("utf-8")

    # Drop the header line so only the comma-separated names remain
    missing_contributors = all_contributors_check_result.replace(
        "Missing contributors in .all-contributorsrc:\n", ""
    ).strip()
    if missing_contributors in ["", "dependabot[bot]"]:  # ignore dependabot[bot]
        print("No missing contributors")
        return

    default_contribution_type = "code"  # default contribution type
    contributors_to_add = missing_contributors.split(", ")
    if "dependabot[bot]" in contributors_to_add:
        contributors_to_add.remove("dependabot[bot]")  # ignore dependabot[bot]

    print("Update .all-contributorsrc to include all contributors read from GitHub")
    for contributor in contributors_to_add:
        # shlex.quote returns plain logins unchanged and quotes anything a POSIX shell would treat specially
        command = (
            f"npx all-contributors-cli add {shlex.quote(contributor)} {default_contribution_type}"
        )
        if not dryrun:
            print("run: " + command)
            subprocess.run(command, shell=True)
        else:
            print("dryrun: " + command)
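The parsing step inside check can also be pulled out into a small pure function, which makes it possible to try it without running npx. The sketch below is only an illustration: the function name parse_missing_contributors and its ignored parameter are hypothetical, and it assumes the check output has the same shape the script above relies on (a header line followed by comma-separated names).

```python
# Header text taken from the check function above
HEADER = "Missing contributors in .all-contributorsrc:\n"


def parse_missing_contributors(check_output, ignored=("dependabot[bot]",)):
    """Return the logins named in the check output, minus ignored accounts.

    Hypothetical helper: the name and the `ignored` parameter are not part
    of the original script.
    """
    names = check_output.replace(HEADER, "").strip()
    if not names:
        return []
    # Names are comma-separated; strip the whitespace around each one
    logins = [name.strip() for name in names.split(",")]
    return [login for login in logins if login and login not in ignored]


# Made-up sample output, not captured from a real repository
sample = HEADER + "alice, dependabot[bot], bob\n"
print(parse_missing_contributors(sample))  # → ['alice', 'bob']
```

Keeping the ignore list as a parameter means other bot accounts can be skipped without touching the parsing logic.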

generate

This function will update the README file with the table of contributors recorded in the .all-contributorsrc file. The code basically calls npx all-contributors-cli generate.

def generate():
    print("Update README.md to generate table of contributors")
    # Pass one command string: with shell=True, a list would hand only "npx" to the shell on POSIX
    subprocess.run("npx all-contributors-cli generate", shell=True)
    print("Done!")

main

This is the main function that calls the other functions in the script. To make things simpler, the script can be run with an argument that determines what gets done.

def main():
    # The first command-line argument selects the action; default to the help text
    command = sys.argv[1] if len(sys.argv) > 1 else "help"
    if command == "init":
        init()
        check()
        generate()
    elif command == "help":
        print(
            """
Commands:
    init: initialize all-contributors for the first time (will generate .all-contributorsrc)
        - python add-all-contributors.py init
    add: add missing contributors (when you already have .all-contributorsrc)
        - python add-all-contributors.py add
    dryrun: dryrun add missing contributors (test without adding)
        - python add-all-contributors.py dryrun
"""
        )
    elif command == "add":
        check()
        generate()
    elif command == "dryrun":
        check(True)
    else:
        print("Unknown command: " + command)
        sys.exit(1)


if __name__ == "__main__":
    main()
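As a design alternative, the same dispatch could be written with argparse from the standard library, which generates the help text on its own. This is a rough sketch rather than what the script does: the --dry-run flag, build_parser and plan are all hypothetical names, and plan only returns the names of the steps instead of calling the real functions so that it can be tried in isolation.

```python
import argparse


def build_parser():
    """Build a parser with one sub-command per action (hypothetical layout)."""
    parser = argparse.ArgumentParser(prog="add-all-contributors.py")
    subcommands = parser.add_subparsers(dest="command")
    subcommands.add_parser(
        "init", help="initialize all-contributors for the first time"
    )
    add = subcommands.add_parser("add", help="add missing contributors")
    # Hypothetical flag standing in for the separate "dryrun" command
    add.add_argument(
        "--dry-run",
        action="store_true",
        help="print the add commands without running them",
    )
    return parser


def plan(argv):
    """Map command-line arguments to the ordered list of steps to run."""
    args = build_parser().parse_args(argv)
    if args.command == "init":
        return ["init", "check", "generate"]
    if args.command == "add":
        return ["check(dryrun=True)"] if args.dry_run else ["check", "generate"]
    # No sub-command given: fall back to showing help
    return ["help"]


print(plan(["add", "--dry-run"]))  # → ['check(dryrun=True)']
```

The trade-off is a little more setup code in exchange for automatic usage messages and argument validation.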

The full code is available as a GitHub Gist.

Usage

  1. Understand all-contributors and decide if you want to use it. You should also have recent versions of npm and Python available on your machine.
  2. Put the script in the root directory of your project.
  3. If nothing is done yet (no .all-contributorsrc), run python add-all-contributors.py init. Remove the add-all-contributors.py file and that's it!
  4. If you already have a .all-contributorsrc file, run python add-all-contributors.py add. Remove the add-all-contributors.py file and that's it!
  5. If you would like to see a help message, run python add-all-contributors.py.
  6. If you would like to see who will be added without actually doing it, run python add-all-contributors.py dryrun. (This assumes the repository already has a .all-contributorsrc file.)
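After running the script, it can be handy to check what ended up in .all-contributorsrc. The sketch below assumes the file is JSON with a contributors list whose entries carry login and contributions keys; that matches my understanding of the format, but treat it as an assumption and verify against your own file. The function name summarize_contributors and the sample data are hypothetical.

```python
import json


def summarize_contributors(rc_text):
    """Map each recorded login to its list of contribution types.

    Assumes the JSON layout described above; hypothetical helper,
    not part of the original script.
    """
    config = json.loads(rc_text)
    return {
        entry["login"]: entry.get("contributions", [])
        for entry in config.get("contributors", [])
    }


# Inline sample standing in for the real file's contents (made-up data)
sample_rc = '{"contributors": [{"login": "alice", "contributions": ["code"]}]}'
print(summarize_contributors(sample_rc))  # → {'alice': ['code']}
```

To run it against a real project, pass in the text of the file, for example by reading .all-contributorsrc with pathlib.Path(".all-contributorsrc").read_text().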

Conclusion

Even though I "solved" my own problem, the comments and discussions on this topic/issue gave me the idea to write this script. So, thanks to random people on the internet for sharing!