diff --git a/linux_intermediate/commandlinetools.rst b/linux_intermediate/commandlinetools.rst
index ea2ca632670f7cd07f2121967963167315de9b83..b37f122d15fcc193b9bacba74226e37a3b9c87fa 100644
--- a/linux_intermediate/commandlinetools.rst
+++ b/linux_intermediate/commandlinetools.rst
@@ -203,6 +203,28 @@ Note the difference:
     # echo "ACCAAGCATTGGAGGAATATCGTAGGTAAA" | sed 's/A/_/g'
     _CC__GC_TTGG_GG__T_TCGT_GGT___
 
+You can use transliteration to replace every instance of one character with another.
+For example, to convert thymine (T) to uracil (U) in a sequence:
+
+ ::
+    # echo "AGTGGCTAAGTCCCTTTAATCAGG" | sed 'y/T/U/'
+    AGUGGCUAAGUCCCUUUAAUCAGG
+
+In the ``y`` command of ``sed``, each character in the first set is replaced with the
+character at the same position in the second set. For example, to complement each base
+of a DNA sequence, writing the result as RNA:
+
+ ::
+    # echo "AGTGGCTAAGTCCCTTTAATCAGG" | sed 'y/ACGT/UGCA/'
+    UCACCGAUUCAGGGAAAUUAGUCC
+
+This is the complementary sequence, but not yet the reverse complement. To reverse it,
+pipe the output of ``sed`` through ``rev``, which reverses the characters of each line:
+
+ ::
+    # echo "AGTGGCTAAGTCCCTTTAATCAGG" | sed 'y/ACGT/UGCA/' | rev
+    CCUGAUUAAAGGGACUUAGCCACU
+
 When used on a file, sed prints the file to standard output, replacing text as it goes along: