Class StringUtil


  • public class StringUtil
    extends Object
    Grab-bag of stateless String-oriented utilities.
    • Constructor Detail

      • StringUtil

        public StringUtil()
    • Method Detail

      • join

        public static <T> String join​(String separator,
                                      Collection<T> objs)
        Parameters:
        separator - String to insert between each pair of adjacent items in objs
        objs - Collection of objects to be joined
        Returns:
        String that concatenates the result of each item's toString() method for all items in objs, with separator between each of them.
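        The documented contract can be sketched in plain Java as follows. This is a minimal illustration, not the library's implementation: the class name JoinSketch is hypothetical, and the handling of null items (here a NullPointerException) is an assumption the documentation does not address.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.stream.Collectors;

final class JoinSketch {
    // Hypothetical stand-in for the documented join: calls toString() on
    // each item and places separator between adjacent results.
    static <T> String join(String separator, Collection<T> objs) {
        return objs.stream()
                   .map(o -> o.toString())
                   .collect(Collectors.joining(separator));
    }

    public static void main(String[] args) {
        System.out.println(join(", ", Arrays.asList(1, 2, 3))); // prints "1, 2, 3"
    }
}
```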
      • join

        public static <T> String join​(String separator,
                                      T... objs)
      • split

        public static int split​(String aString,
                                String[] tokens,
                                char delim)
        Split the string into tokens separated by the given delimiter. Profiling has revealed that the standard String.split() method typically takes more than half of the total time when used for parsing ASCII files. Note that if the tokens array is not large enough to hold all the tokens in the string, the excess tokens are discarded.
        Parameters:
        aString - the string to split
        tokens - an array to hold the parsed tokens
        delim - character that delimits tokens
        Returns:
        the number of tokens parsed
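        The sketch below illustrates the idea of splitting into a caller-supplied array so that the same array can be reused for every line of a file. It is an assumption-laden illustration, not the library's code: the class name SplitSketch is hypothetical, and the treatment of empty or trailing tokens is a guess that may differ from the real method.

```java
final class SplitSketch {
    // Splits aString on delim into the caller-supplied tokens array.
    // Tokens that do not fit are discarded; returns the number stored.
    static int split(String aString, String[] tokens, char delim) {
        int count = 0;
        int start = 0;
        while (count < tokens.length) {
            int end = aString.indexOf(delim, start);
            if (end < 0) {
                // No further delimiter: the rest of the string is the last token.
                tokens[count++] = aString.substring(start);
                break;
            }
            tokens[count++] = aString.substring(start, end);
            start = end + 1;
        }
        return count;
    }

    public static void main(String[] args) {
        String[] tokens = new String[2]; // room for only two of the three fields
        int n = split("chr1\t100\t200", tokens, '\t');
        System.out.println(n + " " + tokens[0] + " " + tokens[1]); // prints "2 chr1 100"
    }
}
```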
      • splitConcatenateExcessTokens

        public static int splitConcatenateExcessTokens​(String aString,
                                                       String[] tokens,
                                                       char delim)
        Split the string into tokens separated by the given delimiter. Profiling has revealed that the standard String.split() method typically takes more than half of the total time when used for parsing ASCII files. Note that the string is split into no more elements than the tokens array will hold, so the final token may contain delimiter characters.
        Parameters:
        aString - the string to split
        tokens - an array to hold the parsed tokens
        delim - character that delimits tokens
        Returns:
        the number of tokens parsed
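        The difference from split can be sketched as follows: once only one slot is left, the remainder of the string is stored whole, delimiters included. The class name SplitExcessSketch is hypothetical, and edge-case behaviour (for example a zero-length tokens array) is an assumption rather than documented fact.

```java
final class SplitExcessSketch {
    static int splitConcatenateExcessTokens(String aString, String[] tokens, char delim) {
        int count = 0;
        int start = 0;
        // Fill all but the last slot with ordinary tokens.
        while (count < tokens.length - 1) {
            int end = aString.indexOf(delim, start);
            if (end < 0) {
                break;
            }
            tokens[count++] = aString.substring(start, end);
            start = end + 1;
        }
        // Whatever remains goes, delimiters included, into the next free slot.
        if (count < tokens.length) {
            tokens[count++] = aString.substring(start);
        }
        return count;
    }

    public static void main(String[] args) {
        String[] tokens = new String[3];
        int n = splitConcatenateExcessTokens("a:b:c:d", tokens, ':');
        System.out.println(n + " " + tokens[2]); // prints "3 c:d"
    }
}
```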
      • toLowerCase

        public static byte toLowerCase​(byte b)
        Parameters:
        b - ASCII character
        Returns:
        lowercase version of arg if it was uppercase, otherwise returns arg
      • toUpperCase

        public static byte toUpperCase​(byte b)
        Parameters:
        b - ASCII character
        Returns:
        uppercase version of arg if it was lowercase, otherwise returns arg
      • toUpperCase

        public static void toUpperCase​(byte[] bytes)
        Converts in place all lower case letters to upper case in the byte array provided.
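        For ASCII bytes, case conversion amounts to shifting by the fixed distance between 'a' and 'A'. The sketch below shows that idea for the three byte-oriented methods above; the class name AsciiCaseSketch is hypothetical and the arithmetic is an assumed approach, not necessarily how the library does it.

```java
import java.nio.charset.StandardCharsets;

final class AsciiCaseSketch {
    private static final int CASE_OFFSET = 'a' - 'A'; // 32 in ASCII

    static byte toUpperCase(byte b) {
        return (b >= 'a' && b <= 'z') ? (byte) (b - CASE_OFFSET) : b;
    }

    static byte toLowerCase(byte b) {
        return (b >= 'A' && b <= 'Z') ? (byte) (b + CASE_OFFSET) : b;
    }

    // In-place conversion: the caller's array is modified, nothing is returned.
    static void toUpperCase(byte[] bytes) {
        for (int i = 0; i < bytes.length; i++) {
            bytes[i] = toUpperCase(bytes[i]);
        }
    }

    public static void main(String[] args) {
        byte[] b = "acgtN1".getBytes(StandardCharsets.US_ASCII);
        toUpperCase(b);
        System.out.println(new String(b, StandardCharsets.US_ASCII)); // prints "ACGTN1"
    }
}
```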
      • assertCharactersNotInString

        public static String assertCharactersNotInString​(String illegalChars,
                                                         char... chars)
        Checks that a String doesn't contain one or more characters of interest.
        Parameters:
        illegalChars - the String to check
        chars - the characters to check for
        Returns:
        the input String, for convenience
        Throws:
        IllegalArgumentException - if the String contains one or more of the characters
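        Returning the input makes the check usable inline, for example when assigning a validated value. A minimal sketch of that contract follows; the class name CharCheckSketch and the exception message text are hypothetical.

```java
final class CharCheckSketch {
    // Returns s unchanged if it contains none of chars; otherwise throws.
    static String assertCharactersNotInString(String s, char... chars) {
        for (char c : chars) {
            if (s.indexOf(c) >= 0) {
                // Message wording is illustrative only.
                throw new IllegalArgumentException(
                        "String contains illegal character '" + c + "': " + s);
            }
        }
        return s;
    }

    public static void main(String[] args) {
        // The return value allows validation and assignment in one step.
        String name = assertCharactersNotInString("sample1", '\t', ' ');
        System.out.println(name); // prints "sample1"
    }
}
```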
      • wordWrap

        public static String wordWrap​(String s,
                                      int maxLineLength)
        Return the input string with newlines inserted to ensure that all lines have length <= maxLineLength. If a word is too long, it is simply broken at maxLineLength. Does not handle tabs intelligently (due to implementer laziness).
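        One way to obtain the behaviour described is a greedy wrap that breaks at the last space that fits and falls back to a hard break inside over-long words. The sketch below is an assumed approach, not the library's implementation: the class name WordWrapSketch and the helper wrapLine are hypothetical, maxLineLength is assumed to be positive, and existing newlines in the input are assumed to be preserved.

```java
final class WordWrapSketch {
    static String wordWrap(String s, int maxLineLength) {
        StringBuilder out = new StringBuilder();
        String[] lines = s.split("\n", -1); // keep existing line breaks
        for (int i = 0; i < lines.length; i++) {
            if (i > 0) {
                out.append('\n');
            }
            wrapLine(lines[i], maxLineLength, out);
        }
        return out.toString();
    }

    // Hypothetical helper: wraps one newline-free line into out.
    private static void wrapLine(String line, int max, StringBuilder out) {
        int start = 0;
        while (line.length() - start > max) {
            // Prefer the last space that keeps the line within max characters.
            int breakAt = line.lastIndexOf(' ', start + max);
            if (breakAt <= start) {
                // No usable space: break the over-long word at max.
                out.append(line, start, start + max).append('\n');
                start += max;
            } else {
                out.append(line, start, breakAt).append('\n');
                start = breakAt + 1; // the space is replaced by the newline
            }
        }
        out.append(line, start, line.length());
    }

    public static void main(String[] args) {
        System.out.println(wordWrap("the quick brown fox", 10));
        // prints "the quick" and "brown fox" on two lines
    }
}
```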
      • wordWrapSingleLine

        public static String wordWrapSingleLine​(String s,
                                                int maxLineLength)
      • intValuesToString

        public static String intValuesToString​(int[] intVals)
      • intValuesToString

        public static String intValuesToString​(short[] shortVals)
      • bytesToString

        public static String bytesToString​(byte[] data)
      • bytesToString

        public static String bytesToString​(byte[] buffer,
                                           int offset,
                                           int length)
      • stringToBytes

        public static byte[] stringToBytes​(String s)
      • stringToBytes

        public static byte[] stringToBytes​(String s,
                                           int offset,
                                           int length)
      • readNullTerminatedString

        public static String readNullTerminatedString​(BinaryCodec binaryCodec)
      • charsToBytes

        public static void charsToBytes​(char[] chars,
                                        int charOffset,
                                        int length,
                                        byte[] bytes,
                                        int byteOffset)
        Convert chars to bytes merely by casting. No character encoding is applied.
        Parameters:
        chars - input chars
        charOffset - where to start converting from chars array
        length - how many chars to convert
        bytes - where to put the converted output
        byteOffset - where to start writing the converted output.
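        Casting a char to a byte keeps only the low 8 bits, so the result is meaningful only for characters in the single-byte range. The sketch below illustrates the documented parameters and that truncation; the class name CharCastSketch is hypothetical.

```java
import java.util.Arrays;

final class CharCastSketch {
    static void charsToBytes(char[] chars, int charOffset, int length,
                             byte[] bytes, int byteOffset) {
        for (int i = 0; i < length; i++) {
            // A narrowing cast: the high byte of each char is discarded.
            bytes[byteOffset + i] = (byte) chars[charOffset + i];
        }
    }

    public static void main(String[] args) {
        byte[] out = new byte[2];
        // 'A' is 0x0041; '\u0141' loses its high byte and also becomes 0x41.
        charsToBytes(new char[] {'A', '\u0141'}, 0, 2, out, 0);
        System.out.println(Arrays.toString(out)); // prints "[65, 65]"
    }
}
```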
      • charToByte

        public static byte charToByte​(char c)
        Convert ASCII char to byte.
      • byteToChar

        public static char byteToChar​(byte b)
        Convert ASCII byte to ASCII char.
      • bytesToHexString

        public static String bytesToHexString​(byte[] data)
        Convert a byte array into a String hex representation.
        Parameters:
        data - Input to be converted.
        Returns:
        String twice as long as data.length with hex representation of data.
      • hexStringToBytes

        public static byte[] hexStringToBytes​(String s)
                                       throws NumberFormatException
        Convert a String containing hex characters into an array of bytes with the binary representation of the hex string.
        Parameters:
        s - Hex string. Length must be even because each pair of hex chars is converted into a byte.
        Returns:
        byte array with binary representation of hex string.
        Throws:
        NumberFormatException
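        The two hex conversions are inverses of each other: each byte becomes two hex digits, and each pair of digits becomes one byte. The sketch below shows a round trip under stated assumptions: the class name HexSketch is hypothetical, upper-case output digits are a guess, and the exception messages are illustrative.

```java
final class HexSketch {
    private static final char[] HEX = "0123456789ABCDEF".toCharArray();

    static String bytesToHexString(byte[] data) {
        char[] out = new char[data.length * 2]; // two digits per byte
        for (int i = 0; i < data.length; i++) {
            out[2 * i] = HEX[(data[i] >> 4) & 0xF]; // high nibble
            out[2 * i + 1] = HEX[data[i] & 0xF];    // low nibble
        }
        return new String(out);
    }

    static byte[] hexStringToBytes(String s) throws NumberFormatException {
        if (s.length() % 2 != 0) {
            throw new NumberFormatException("Hex string must have even length: " + s);
        }
        byte[] out = new byte[s.length() / 2];
        for (int i = 0; i < out.length; i++) {
            int hi = Character.digit(s.charAt(2 * i), 16);
            int lo = Character.digit(s.charAt(2 * i + 1), 16);
            if (hi < 0 || lo < 0) {
                throw new NumberFormatException("Not a hex string: " + s);
            }
            out[i] = (byte) ((hi << 4) | lo);
        }
        return out;
    }

    public static void main(String[] args) {
        String hex = bytesToHexString(new byte[] {0x00, 0x7F, (byte) 0xFF});
        System.out.println(hex); // prints "007FFF"
        System.out.println(hexStringToBytes(hex).length); // prints "3"
    }
}
```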
      • toHexDigit

        public static char toHexDigit​(int value)
      • reverseString

        public static String reverseString​(String s)
        Reverse the given string. Does not check for null.
        Parameters:
        s - String to be reversed.
        Returns:
        New string that is the reverse of the input string.
      • isBlank

        public static boolean isBlank​(String str)

        Checks if a String is whitespace, empty ("") or null.

         StringUtil.isBlank(null)      = true
         StringUtil.isBlank("")        = true
         StringUtil.isBlank(" ")       = true
         StringUtil.isBlank("sam")     = false
         StringUtil.isBlank("  sam  ") = false
         
        Parameters:
        str - the String to check, may be null
        Returns:
        true if the String is null, empty or whitespace
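        The truth table above can be reproduced with a short null-safe scan. This is a sketch only: the class name BlankSketch is hypothetical, and using Character.isWhitespace as the definition of whitespace is an assumption.

```java
final class BlankSketch {
    static boolean isBlank(String str) {
        if (str == null) {
            return true;
        }
        for (int i = 0; i < str.length(); i++) {
            if (!Character.isWhitespace(str.charAt(i))) {
                return false; // found a non-whitespace character
            }
        }
        return true; // empty, or whitespace only
    }

    public static void main(String[] args) {
        System.out.println(isBlank(null));      // prints "true"
        System.out.println(isBlank("  sam  ")); // prints "false"
    }
}
```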
      • repeatCharNTimes

        public static String repeatCharNTimes​(char c,
                                              int repeatNumber)
      • asEmptyIfNull

        public static String asEmptyIfNull​(Object string)
      • levenshteinDistance

        public static int levenshteinDistance​(String string1,
                                              String string2,
                                              int swap,
                                              int substitution,
                                              int insertion,
                                              int deletion)
      • hammingDistance

        public static int hammingDistance​(String s1,
                                          String s2)
        Calculates the Hamming distance (number of character mismatches) between two strings s1 and s2. Since Hamming distance is not defined for strings of differing lengths, we throw an exception if the two strings are of different lengths. Hamming distance is case sensitive and does not have any special treatment for DNA.
        Parameters:
        s1 - The first string to compare
        s2 - The second string to compare. Swapping s1 and s2 does not change the value returned.
        Returns:
        Hamming distance between s1 and s2.
        Throws:
        IllegalArgumentException - If the two strings have differing lengths.
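        The definition above translates directly into a position-by-position comparison. A minimal sketch follows; the class name HammingSketch and the exception message are hypothetical.

```java
final class HammingSketch {
    static int hammingDistance(String s1, String s2) {
        if (s1.length() != s2.length()) {
            throw new IllegalArgumentException(
                    "Hamming distance is undefined for strings of different lengths.");
        }
        int mismatches = 0;
        for (int i = 0; i < s1.length(); i++) {
            // Plain char comparison: case sensitive, no special DNA handling.
            if (s1.charAt(i) != s2.charAt(i)) {
                mismatches++;
            }
        }
        return mismatches;
    }

    public static void main(String[] args) {
        System.out.println(hammingDistance("ACGT", "ACGA")); // prints "1"
        System.out.println(hammingDistance("acgt", "ACGT")); // prints "4"
    }
}
```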
      • isWithinHammingDistance

        public static boolean isWithinHammingDistance​(String s1,
                                                      String s2,
                                                      int maxHammingDistance)
        Determines if two strings s1 and s2 are within maxHammingDistance of each other using the Hamming distance metric. Since Hamming distance is not defined for strings of differing lengths, we throw an exception if the two strings are of different lengths. Hamming distance is case sensitive and does not have any special treatment for DNA.
        Parameters:
        s1 - The first string to compare
        s2 - The second string to compare. Swapping s1 and s2 does not change the value returned.
        maxHammingDistance - The largest Hamming distance the strings can have for this function to return true.
        Returns:
        true if the two strings are within maxHammingDistance of each other, false otherwise.
        Throws:
        IllegalArgumentException - If the two strings have differing lengths.
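        A threshold test does not need the full distance: counting can stop as soon as the limit is exceeded. The sketch below shows that early exit as one plausible approach; the class name HammingBoundSketch is hypothetical and the documentation does not say whether the library short-circuits in this way.

```java
final class HammingBoundSketch {
    static boolean isWithinHammingDistance(String s1, String s2, int maxHammingDistance) {
        if (s1.length() != s2.length()) {
            throw new IllegalArgumentException(
                    "Hamming distance is undefined for strings of different lengths.");
        }
        int mismatches = 0;
        for (int i = 0; i < s1.length(); i++) {
            if (s1.charAt(i) != s2.charAt(i) && ++mismatches > maxHammingDistance) {
                return false; // limit exceeded, no need to scan further
            }
        }
        return mismatches <= maxHammingDistance;
    }

    public static void main(String[] args) {
        System.out.println(isWithinHammingDistance("ACGT", "ACGA", 1)); // prints "true"
        System.out.println(isWithinHammingDistance("ACGT", "TGCA", 1)); // prints "false"
    }
}
```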
      • humanReadableByteCount

        public static String humanReadableByteCount​(long bytes)
        Takes a long value representing a number of bytes and produces a human-readable byte count.
        Parameters:
        bytes - The number of bytes to create a human-readable string for.
        Returns:
        A human-readable string for the number of bytes given.