Git Attributes: A Cleaner Way to Collaborate on a Customized Repo

Git Attributes makes it cleaner for developers to collaborate on a highly customized project. Let's see how!

Posted by Wasin Thonkaew on February 8, 2018

From time to time, we will also share development techniques we have used in our main or side projects, so our posts won't be just announcements. This time we share a lesser-known benefit of Git that you might not have come across: Git Attributes.


Whenever several developers work together on a public Git repository, there are cases where heavily customized settings such as an access token, endpoint URL, or user name and password need to be modified for testing on a local development machine, but without exposing the internal system to the public and without changing the Git tree.

There are a few solutions to this:

  1. Mark the configuration file as assume-unchanged via


    git update-index --assume-unchanged <your file>


    After you execute this command, git stops checking that file for changes, so your local edits no longer show up.

  2. Keep such configuration settings outside the repository, on the host machine (either local or server). Environment variables are one example; they suit a NodeJS server project well because the values can be read easily in code.

The flaw in 1. is that git status gives no sign that a file is marked assume-unchanged, so there is no visible evidence of it. If you later really want to change such a file and forget that git is not detecting your changes, you might lose your work.
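Git can still report these files, just not through git status. As a minimal sketch (the two function names are hypothetical helpers, not git commands): `git ls-files -v` prints a lowercase status tag in front of every assume-unchanged path, and `git update-index --no-assume-unchanged` clears the flag again.

```shell
#!/bin/bash
# Hypothetical helpers; run them from inside a git working tree.

# List the paths currently flagged assume-unchanged.
# `git ls-files -v` prints "<tag> <path>" and uses a lowercase tag
# (for example "h") for assume-unchanged entries.
list_assume_unchanged() {
    git ls-files -v | grep '^[[:lower:]]' | cut -c3-
}

# Clear the flag so git resumes detecting changes to the file.
unmark_assume_unchanged() {
    git update-index --no-assume-unchanged "$1"
}
```

Running list_assume_unchanged before a risky operation is one way to remind yourself which files git is currently not watching.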

The flaw in 2. is that not every project is suited to this approach. If it's an app, you can't easily read environment variables and plug them into source code; a more customized solution is needed.

The good news is that there is a better way, called Git Attributes.

    What is Git Attributes?

Git Attributes is configured through a file, much like .gitignore. A .gitignore file tells Git which files to ignore entirely. A .gitattributes file instead tells Git which method to apply to the target files so that their committed content stays unaltered automatically, while still allowing developers to modify those files so the changes take effect when working or testing on a local machine.

That's a lot to digest, and it is best understood through an example.

Smudge. Image from git-scm.com.


Clean. Image from git-scm.com.


The images above show the two main concepts: Smudge and Clean.

Smudge

    It's the filter Git runs when it checks a file out into your working directory. Imagine it applying the configuration you need for testing, so that you can test locally.

Clean

    It's the filter Git runs when a file is staged, restoring the clean state of your target file before the changes are committed to the repository.

Smudge and Clean work together to give you locally customized configuration in your project repository, allowing you to test something without making changes to the repository itself. Everything is clean.
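The reason the repository stays clean is that the two filters undo each other: cleaning a smudged file gives back exactly what was committed. Here is a minimal sketch of that property outside of git. The url= line, the localhost address, and the use of sed instead of perl are all assumptions made for brevity.

```shell
#!/bin/bash
# Hypothetical stand-ins for a smudge/clean pair. Each reads a file's
# content on STDIN and writes the transformed content to STDOUT.

# smudge: committed content -> local working copy (fills in a real value).
smudge() { sed 's|^url=.*|url=http://localhost:8080|'; }

# clean: local working copy -> committed content (restores the placeholder).
clean() { sed 's|^url=.*|url=<placeholder>|'; }

committed='url=<placeholder>'
working="$(printf '%s\n' "$committed" | smudge)"
restored="$(printf '%s\n' "$working" | clean)"

echo "working:  $working"
echo "restored: $restored"
```

As long as restored equals committed, git has nothing new to record, however the working copy looks.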

    Practical Example

The best way to fully understand it is to get our hands dirty.

Assume you have a git repository containing a NodeJS server project that can be tested locally with the configuration as set.

Assume the project directory's structure is as follows:

MyProject
   |_ index.js
   |_ package.json
   |_ config.js

It's a simple NodeJS project with a main file, index.js, along with its configuration file, config.js.

The content of config.js is as follows:

module.exports = {
    endpoint_url: "<your endpoint url here>"
}


As you can see, config.js won't work as-is right after you check out the repository; you have to make changes to it first. This is the clean state. That's fine, as we don't want others to know the server's configuration settings.

So how can we test locally without our changes to the file showing up in Git? Assume that we want config.js to have the following configuration.

module.exports = {
    endpoint_url: "https://apps.abzi.co/myproject"
}

Here are the steps.

  1. Create .gitattributes file at the root of your project.

  2. Enter the following line into that file

    config.js    filter=config_hide


    Both the smudge and clean methods will be defined under the filter name we just entered here, config_hide.

  3. Modify the .git/config file. You can use your favourite editor or just vi. Enter the following lines into the file.

    [filter "config_hide"]
        smudge = perl -p -e 's|endpoint_url: (?:.+)|endpoint_url: \"https://apps.abzi.co/myproject\"|g'
        clean = perl -p -e 's|endpoint_url: (?:.+)|endpoint_url: \"<your endpoint url here>\"|g'

    For smudge, we want to inject "https://apps.abzi.co/myproject" into config.js, replacing the endpoint_url: ... line by means of a regular expression in a perl command. The substitution uses | as its delimiter rather than /, because the URL itself contains slashes that would otherwise end the expression early. We decided to use perl because its regular expression syntax and concepts are closest to what JavaScript offers, so we can smoothly transfer our knowledge there.


    Notice the use of (?:.+) . The .+ part greedily matches one or more characters other than a newline or line terminator, so it consumes the rest of the line, and ?: makes the group non-capturing, meaning we don't care to capture its result. You can easily find JavaScript regular expression resources online to learn more about this.

    Likewise, for clean we want to restore the clean state that config.js started with, which is "<your endpoint url here>".

  4. At this point, make sure you have already committed the current state of your repository. If not, execute the following commands

    git add -A
    git commit -m "your commit message here"

  5. Now we can try our smudge and clean filter by first removing the file from the working directory

    rm config.js

  6. Then check out config.js again

    git checkout config.js

  7. You will notice that the content of config.js has changed to what we expect, including our working configuration endpoint_url: "https://apps.abzi.co/myproject". And if you check with git status, you will see that the working tree is clean!

  8. You can now test the project locally with the endpoint URL pointing to where we want it.

  9. Done! You are free to continue developing your project and test locally at the same time without polluting or making changes to git repository itself. This is clean :)
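The steps above can be replayed in a throwaway repository. The following is a minimal sketch under a few assumptions: git and perl are installed, the function name demo_config_hide is hypothetical, the filter is registered with git config rather than by editing .git/config by hand, and the perl substitution uses | as its delimiter so that the slashes in the URL do not end the expression.

```shell
#!/bin/bash
# Hypothetical helper that replays the walkthrough in a temporary repository.
demo_config_hide() {
    local repo
    repo="$(mktemp -d)" || return 1
    (
        cd "$repo" || exit 1
        git init -q .

        # The committed, "clean" version of the configuration file.
        printf '%s\n' 'module.exports = {' \
            '    endpoint_url: "<your endpoint url here>"' \
            '}' > config.js

        # Steps 1-2: attach the config_hide filter to config.js.
        printf 'config.js    filter=config_hide\n' > .gitattributes

        # Step 3: define the filter in the repository-local configuration.
        git config filter.config_hide.smudge \
            "perl -p -e 's|endpoint_url: (?:.+)|endpoint_url: \"https://apps.abzi.co/myproject\"|g'"
        git config filter.config_hide.clean \
            "perl -p -e 's|endpoint_url: (?:.+)|endpoint_url: \"<your endpoint url here>\"|g'"

        # Step 4: commit (identity passed inline, so no global setup is needed).
        git add -A
        git -c user.name=demo -c user.email=demo@example.com \
            commit -q -m "initial commit"

        # Steps 5-6: delete the working copy and check it out again.
        rm config.js
        git checkout -q -- config.js

        # Step 7: show the smudged file and the porcelain status.
        cat config.js
        echo "status: [$(git status --porcelain)]"
    )
    rm -rf "$repo"
}
```

Calling demo_config_hide should print config.js with the local endpoint URL filled in, followed by a status line with nothing between the brackets, which is the "clean" result described in step 7.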

    What about Scripting?

We can also build on the example above: instead of executing a one-line perl command to substitute text within the file, we could create our own script.

Remember that git sends the content of the target file to our command or script via STDIN (standard input) and takes whatever it writes to STDOUT (standard output) as the result. So if you create a custom script, it needs to read its input from STDIN.
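Because a filter is just a program that reads STDIN and writes STDOUT, it can be tried by hand before git is involved. A minimal sketch, where try_filter is a hypothetical helper name:

```shell
#!/bin/bash
# Feed a file to a filter command the same way git does:
# file content in on STDIN, transformed content out on STDOUT.
#   $1      file whose content is used as input
#   $2...   the filter command and its arguments
try_filter() {
    local file="$1"
    shift
    "$@" < "$file"
}
```

For example, try_filter config.js ./_scripts/config_hide.sh smudge would print what the smudge step produces without touching the file itself.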

So we could change the content of .git/config to be as follows:

[filter "config_hide"]
    smudge = ./_scripts/config_hide.sh smudge
    clean = ./_scripts/config_hide.sh clean

Above, we tell Git to execute the config_hide.sh script, passing either smudge or clean as a parameter so that our script knows which operation it is.

Note that _scripts/config_hide.sh is not in our git tree, so you need to list it in .gitignore. Or better, you can create the file outside of the git repository so you won't have to modify .gitignore for this purpose.
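If the script lives outside the repository, the filter can point at it by absolute path. As a sketch, the same two settings can be written with git config instead of editing .git/config by hand. The helper name install_config_hide is hypothetical, and the path is assumed to contain no spaces.

```shell
#!/bin/bash
# Hypothetical helper: register a filter script stored outside the repository.
# Run it from inside the repository; the settings land in .git/config.
#   $1   absolute path to the script (assumed to contain no spaces)
install_config_hide() {
    local script="$1"
    git config filter.config_hide.smudge "$script smudge"
    git config filter.config_hide.clean "$script clean"
}
```

For instance, install_config_hide "$HOME/bin/config_hide.sh" would register a copy kept in a personal bin directory.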

I have included an initial working script that reads all input from STDIN, so you can adapt it and build on it, as follows.

#!/bin/bash

# read all input from STDIN (trailing newlines are stripped here
# and one is added back by printf below)
INPUT="$(cat)"

if [ "$1" == "smudge" ]; then
    # smudge
    # do something here; for now the content passes through unchanged
    printf '%s\n' "$INPUT"
else
    # clean
    # do something here; for now the content passes through unchanged
    printf '%s\n' "$INPUT"
fi

In the script above, "$INPUT" is the input content to work with. So you can do printf '%s\n' "$INPUT" | perl -p -e 's/something/tosomething/g' or, for something more complex, chain substitutions, e.g. perl -p -e 's/something/tosomething/g; s/something2/tosomething2/g' . It can get as complex as you need, because a script gives you more control and flexibility than a one-line command.
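As one possible way to fill in the two branches, here is a sketch that chains two substitutions per direction. The access_token field and both of its values are hypothetical, added only to show chaining; the body is wrapped in a function so it can be tried in a shell, and perl reads STDIN directly instead of going through "$INPUT".

```shell
#!/bin/bash
# Sketch of a filled-in config_hide script body, wrapped in a function.
# access_token and its two values are hypothetical, for illustration only.
config_hide() {
    if [ "$1" = "smudge" ]; then
        # committed placeholders -> local values
        perl -p -e '
            s|endpoint_url: "[^"]*"|endpoint_url: "https://apps.abzi.co/myproject"|;
            s|access_token: "[^"]*"|access_token: "local-test-token"|'
    else
        # local values -> committed placeholders
        perl -p -e '
            s|endpoint_url: "[^"]*"|endpoint_url: "<your endpoint url here>"|;
            s|access_token: "[^"]*"|access_token: "<your access token here>"|'
    fi
}
```

Matching only the quoted value with "[^"]*" leaves anything after it on the line, such as a trailing comma, untouched. To use this as the script itself, the file would end with a call such as config_hide "$@".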



That's it!

If you found it useful, please give this post a thumbs up, or comment if you got stuck or have something to say.

Thanks for reading and see you in the next article.