As part of our customer engagement, we have defined a naming convention for GitHub repositories.
We have ideas about which conventions make sense, although the type of work influences these. As we tend to spend some of our engineering time working with different languages, conventions around naming formats make sense to the engineers. These, of course, might not make sense to people in other fields.
GitHub, by default, prevents certain characters from being included in the repository name, but how do we use those left to provide meaning to engineers, testing engineers, and others who use the code base and GitHub day in and day out?
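As a rough illustration of the character restriction, the sketch below checks a candidate name before anyone tries to create the repository. It assumes that GitHub accepts ASCII letters, digits, full stops, hyphens, and underscores, up to 100 characters; treat those rules as our working assumption and confirm them against GitHub's own documentation. The function name is ours, not part of any GitHub API.

```python
import re

# Assumption: GitHub repository names may contain ASCII letters, digits,
# '.', '-' and '_', and may be at most 100 characters long.
ALLOWED_NAME = re.compile(r"[A-Za-z0-9._-]{1,100}")


def is_allowed_repo_name(name: str) -> bool:
    """Return True if the whole name uses only the assumed allowed characters."""
    return ALLOWED_NAME.fullmatch(name) is not None


if __name__ == "__main__":
    for candidate in ("project1-rest-api", "my repo", "billing/api"):
        print(candidate, is_allowed_repo_name(candidate))
```

That leaves the hyphen, underscore, and full stop as the only separators available to carry meaning.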
Semantics
Semantics is the study of meaning when applied to linguistics, semiotics, programming languages, and logic. You might be familiar with the application of semantics in a programming language, as distinct from syntax. Furthermore, software developers and DevSecOps engineers will likely have come across semantic versioning (semver.org), which aims to give meaning to application version numbers.
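To show how a position in a string can carry meaning, here is a minimal sketch of reading a semantic version. It handles only the plain MAJOR.MINOR.PATCH form and ignores pre-release and build metadata; the helper names are hypothetical.

```python
from typing import NamedTuple


class Version(NamedTuple):
    major: int
    minor: int
    patch: int


def parse_version(text: str) -> Version:
    """Parse a plain MAJOR.MINOR.PATCH string such as '1.4.2'."""
    major, minor, patch = (int(part) for part in text.split("."))
    return Version(major, minor, patch)


def change_kind(old: str, new: str) -> str:
    """Describe what the difference between two versions signals to a reader."""
    before, after = parse_version(old), parse_version(new)
    if after.major != before.major:
        return "breaking change"
    if after.minor != before.minor:
        return "new backwards-compatible functionality"
    if after.patch != before.patch:
        return "backwards-compatible bug fix"
    return "no change"


if __name__ == "__main__":
    print(change_kind("1.4.2", "2.0.0"))
    print(change_kind("1.4.2", "1.5.0"))
```

Each field tells the reader something before they open a changelog, which is the effect we want from a repository name.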
We can use a semantic approach to repository naming.
Tackling the problem
A community-maintained Git style guide is hosted on GitHub (https://github.com/agis/git-style-guide), but it does not explicitly list a standard for the repository name itself.
Other articles, guides, repositories, and Stack Overflow answers suggest that the consensus is to use hyphens to separate portions of the repository name. Some organisations have taken the step of formalising this.
One example is the British Columbia Policy Framework for GitHub Document Naming Repos.
This document enumerates criteria for a repository name, which include:
- Descriptive
- Readable
- Consistent
- Contextual
- Future-friendly
- Extensible
- Reusable
- Brief (short/succinct)
All of these seem like very reasonable suggestions. Using this as a guideline, we have broken down our initial draft standard into three sections separated by hyphens:
section1-section2-section3
This format consists of sections defining the project name, purpose, and framework or language. An example of what this might look like would be:
project1-restapi-python
Each of these sections could be further hyphenated if two words need to be split, for example, “rest-api”.
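A minimal sketch of how such a name could be assembled from its sections is shown below. Lower-casing every section is our assumption, based on the examples, and the function names are hypothetical.

```python
import re


def slugify(section: str) -> str:
    """Lower-case a section and join its words with single hyphens."""
    words = re.findall(r"[a-z0-9]+", section.lower())
    return "-".join(words)


def build_repo_name(*sections: str) -> str:
    """Join the slugified sections with hyphens, in the order given."""
    slugs = [slugify(section) for section in sections]
    if not slugs or not all(slugs):
        raise ValueError("every section needs at least one letter or digit")
    return "-".join(slugs)


if __name__ == "__main__":
    print(build_repo_name("project1", "restapi", "python"))
    print(build_repo_name("Project1", "REST API", "Python"))
```

One consequence is worth noting: once a section is itself hyphenated, the boundaries between sections can no longer be recovered from the name alone, so readers rely on knowing the convention.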
Conventions
Having drawn up a semantic naming structure based on general research, we are now proposing a shorter structure:
section1-section2
Including the framework or programming language is not that useful, as many repositories mix languages, for example PHP and JavaScript.
Therefore, a format made up of the product or project name and the repository purpose is more suitable.
For example:
project1-rest-api
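Reading such a name back can be sketched as below. This assumes the project name itself contains no hyphen, so that everything after the first hyphen is the purpose. That is a simplification we are making for illustration, not part of the convention.

```python
def split_repo_name(name: str) -> tuple[str, str]:
    """Split 'project-purpose' into (project, purpose) at the first hyphen.

    Assumption: the project section contains no hyphen of its own.
    """
    project, separator, purpose = name.partition("-")
    if not separator or not project or not purpose:
        raise ValueError(f"{name!r} does not look like project-purpose")
    return project, purpose


if __name__ == "__main__":
    print(split_repo_name("project1-rest-api"))
```

A team whose project names contain hyphens would need a list of known project names to split on instead.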
No single solution fits every case; therefore, we have come up with two, which between them address business and project needs as well as the general needs of common repositories.
The first approach is suited to a project team or department where multiple products exist and consist of sub-components, such as microservices:
[product/project name]-[purpose] e.g. myproject-rest-api
Structure hierarchy:
- Top folder
  - Sub folder 1
    - Sub folder 1-1
    - Sub folder 1-2
  - Sub folder 2
    - Sub folder 2-1
    - Sub folder 2-2
The second approach reflects a more common pattern, typical of open-source projects and common libraries:
[language/framework]-[product/project] e.g. python-security-scripts
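Because both conventions share the same shape, a reader or a script needs some way to tell them apart. One possible sketch is to treat the first section as a language or framework only when it appears in a known list. The list below is hypothetical and would be maintained by each team.

```python
# Hypothetical list; each team would maintain its own.
KNOWN_LANGUAGES = {"python", "php", "javascript"}


def describe_repo_name(name: str) -> dict[str, str]:
    """Label the sections of a name according to whichever convention it matches."""
    first, separator, rest = name.partition("-")
    if not separator or not first or not rest:
        raise ValueError(f"{name!r} has fewer than two sections")
    if first in KNOWN_LANGUAGES:
        # [language/framework]-[product/project]
        return {"convention": "library", "language": first, "project": rest}
    # [product/project name]-[purpose]
    return {"convention": "product", "project": first, "purpose": rest}


if __name__ == "__main__":
    print(describe_repo_name("myproject-rest-api"))
    print(describe_repo_name("python-security-scripts"))
```

This only works if no product is named after a language, which is one more reason to agree the convention up front.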
There may be more than one correct approach, but teams must use a convention that works for them and stick to it. Overall, this helps maintain repository hygiene and helps engineers who deal with dozens of repositories to quickly find what they are looking for.
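Sticking to a convention is easier when it can be checked. The sketch below flags names in a list that do not follow a lower-case, hyphen-separated pattern with at least two sections. The pattern is our assumption about what a team might enforce, and the repository names are made up.

```python
import re

# Assumption: lower-case alphanumeric sections joined by single hyphens,
# with at least two sections.
CONVENTION = re.compile(r"[a-z0-9]+(?:-[a-z0-9]+)+")


def find_nonconforming(names: list[str]) -> list[str]:
    """Return the names that do not match the assumed convention, in order."""
    return [name for name in names if CONVENTION.fullmatch(name) is None]


if __name__ == "__main__":
    repositories = [
        "myproject-rest-api",
        "python-security-scripts",
        "MyProject_API",
        "scratch",
    ]
    print(find_nonconforming(repositories))
```

A check like this could run periodically against an organisation's repository list, however that list is obtained.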