File Paths on OS X and Windows, Shell Escaping, and how it relates to REALbasic


There is a lot of confusion about how programmers, especially those using REAL Studio, should use paths to files (and folders).

I will try to shed some light on this.

1.  Paths on OS X

Because OS X descends from two very different operating systems (the classic Mac OS and Unix), there are two standard formats for specifying a complete path as a string:

  1. Carbon format, i.e. old style: Starting with the volume name, each folder and the final file or folder name are appended, separated by colons (":")
  2. POSIX format, also commonly known as the Unix format: The forward slash ("/") is used as a separator, and there's no notion of volumes. Instead, OS X has the convention of placing all user-mounted volumes into a folder named "Volumes" at the root level.

A sample Carbon path: My External Disk:a.txt
The same on the POSIX side: /Volumes/My External Disk/a.txt

Which format is used depends on which OS functions are used to refer to the file system. In either case, the only reserved character for a path is the ":" or the "/", respectively. It's possible to have any control character in a path, including a NUL character, at least on the lowest level of the file system. It's still good practice not to use the NUL char in names because it might confuse a lot of software, especially BSD (Unix) based software, where strings are usually NUL-terminated, which breaks access to paths containing such chars.

These two special chars (":" and "/") are even interchangeable between the two formats. For instance:

This is a legal Carbon path: /\.txt.
The backslash is legal anyway, and the forward slash is legal for Carbon paths.

The same path would look like this on the POSIX side: :\.txt
Note that the illegal "/" gets replaced with the legal ":" here.
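
The swap described above can be sketched in REALbasic as follows. This is only an illustration: SwapPathSeparators is a hypothetical helper name, it handles a single file or folder name rather than a whole path, and it assumes the name does not contain Chr(1), which is used as a temporary placeholder.

```realbasic
Function SwapPathSeparators(name As String) As String
  // Hypothetical helper, not part of REALbasic itself.
  // Swaps "/" and ":" inside ONE file or folder name (not a whole path).
  // Chr(1) is a temporary placeholder; the name is assumed not to contain it.
  name = name.ReplaceAll(":", Chr(1))
  name = name.ReplaceAll("/", ":")
  return name.ReplaceAll(Chr(1), "/")
End Function
```

Passing the Carbon name /\.txt from the example yields :\.txt, and passing that result again gives back the original name.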

2.  Paths on Windows

Classically, Windows uses two path separators: the backward slash ("\") and the colon (":"). The backslash separates folders, while the colon is used to lead in a drive letter (similar to a volume on OS X). There are actually two ways to define a complete path:

  1. Classic: c:\folder\file
  2. UNC: \\server\share\path\file

Windows defines a few additional reserved chars. See Microsoft's official documentation. Note that Windows even reserves the Unix-style forward slash ("/") and the double quote ("). The latter makes it easier to pass paths as arguments via a command line, as you'll see later on, because there's no complicated escaping necessary.

3.  Passing arguments to programs

There is a common misunderstanding about how arguments get passed to command line tools (utilities) on Unix systems (including OS X and Linux). Let me clarify:

When you enter a command with arguments, the utility will receive them as separate arguments, i.e. the arguments are passed as an array of strings, not as one long string.

On Windows, however, this appears to be different: There, the entered command line gets passed directly as a single string to the utility, which can then either parse the line itself or use an OS-provided function ("CommandLineToArgvW") to have the arguments split up Unix-style. This requires, though, that the invoker of the utility follows the rules for specifying separate arguments, or they won't be split up correctly (e.g. when a single argument contains blanks).

This already marks a big difference in how to invoke a program on OS X vs. Windows:

On OS X, if you pass multiple arguments and use an OS-level function to invoke the program (e.g. exec), you can simply pass them as an array, which is convenient. This also means that these arguments, if they're paths, are meant to be true POSIX paths, i.e. they are not escaped.

On Windows, however, you need to construct a command line which follows the rules for escaping arguments. The rule is quite simple, though: You separate arguments with a blank (space), and you may enclose each argument in double quotes, which becomes mandatory if the argument contains quotes or blanks. If the argument contains quotes, each of them needs to be duplicated - this, however, does not matter for passing paths, since they are not allowed to contain quotes. This also gives us a simple rule to use on Windows: Put any path in double quotes, no matter whether it contains spaces or not.
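
As a rough sketch of the "always quote paths" rule, the following REALbasic function wraps a path in double quotes. QuoteWindowsPath is a hypothetical name. The doubling of trailing backslashes is an addition of this sketch, not part of the rule above: it assumes the receiving program splits its command line with the Microsoft C runtime rules, under which a backslash directly before a double quote escapes that quote, so a folder path ending in "\" would otherwise swallow the closing quote.

```realbasic
Function QuoteWindowsPath(path As String) As String
  // Hypothetical helper: wraps a path in double quotes for a Windows command line.
  // Relies on the fact that Windows paths can't contain a double quote themselves.

  // Collect one extra backslash for every trailing backslash, so that the
  // closing quote is not read as an escaped quote (assumption: the target
  // program uses the Microsoft C runtime parsing rules).
  dim tail as String
  dim i as Integer = path.Len
  while i > 0 and path.Mid(i, 1) = "\"
    tail = tail + "\"
    i = i - 1
  wend

  return Chr(34) + path + tail + Chr(34)
End Function
```

A command line could then be assembled as "mytool.exe " + QuoteWindowsPath(f.AbsolutePath), where mytool.exe stands in for whatever program you want to invoke.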

4.  Paths in command line shells

This is about using a command line terminal, aka a shell, such as Terminal.app on OS X or cmd.exe on Windows, to invoke a program (utility) with arguments.

Now that I have clarified that paths, apart from the general rule to put them in quotes on Windows, are passed without escaping to a utility, we get to the part where we enter a path into a command shell.

4.1  The basics on command shell arguments

Since commands usually take more than one argument, there needs to be a way to separate multiple arguments when entering them in a single line.

The common convention for this is to use the blank (space) as an argument separator.

Example:

utility_name a b c
Here the command "utility_name" is invoked with 3 arguments, their values being "a", "b" and "c".

Now imagine you want to pass a path as an argument. Usually this would simply work:

utility_name /folder/a.txt

4.2  Argument escaping

However, what if the path contains a blank? You don't want that to be interpreted as an argument separator by the shell, after all.

In this case, you need to tell the command shell that you mean a path in one piece.

On Windows, simply put it into quotes and you're done. Since file names on Windows can't contain a quote char, there's no further escaping necessary.

On OS X, you have two choices. If the argument contains blanks but no backslash, double quote, dollar sign ("$") or backtick, simply put the entire argument into double quotes. Otherwise you need to escape all special chars in it, which includes the blank (space), the quote (") and the escape char itself, which is the backslash ("\").

Therefore, to refer to a file whose POSIX path contains just blanks such as:

/My Folder/My File.txt

You'd enter into the command line either:

"/My Folder/My File.txt"

Or:

/My\ Folder/My\ File.txt

However, a path containing a quote or backslash such as:

"test".txt

Can be entered as either:

\"test\".txt

Or:

"\"test\".txt"

So, on OS X, there is no way around escaping when passing paths to a shell. The shell, and not the invoked utility, will then unescape the arguments in the command line and pass them as an "argv" array to the utility.
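
The backslash style of escaping can be sketched in REALbasic like this. EscapeForPosixShell is a hypothetical name, and the set of "safe" characters is a deliberately conservative assumption: everything outside it gets a backslash, which is more than strictly necessary but harmless. Two cases are not covered: a name containing a line break (backslash plus newline is a line continuation in the shell) and an empty argument.

```realbasic
Function EscapeForPosixShell(arg As String) As String
  // Hypothetical helper: puts a backslash in front of every character that is
  // not in a small "known safe" set, so a POSIX shell passes the argument on
  // unchanged. Not suitable for arguments containing line breaks.
  const kSafe = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789/._-"

  dim result as String
  for i as Integer = 1 to arg.Len
    dim ch as String = arg.Mid(i, 1)
    // InStrB compares bytes, so the test is case-sensitive and any
    // non-ASCII character counts as "not safe"
    if InStrB(kSafe, ch) = 0 then
      result = result + "\"
    end if
    result = result + ch
  next

  return result
End Function
```

For the two sample paths above, this produces /My\ Folder/My\ File.txt and \"test\".txt. The result is meant to be appended to a command string that goes through a shell; it must not be used for arguments that are handed to a program directly as an array.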

5.  What this all means to REALbasic programs

5.1  A very short and simple rule you should follow

This is all you need to remember:

Stay away from FolderItem.ShellPath and PathTypeShell

Unless you really know what you get from ShellPath - i.e. an escaped path that's only good for passing to a Shell class.

In general, get the MacOSLib for OS X and use its POSIX path methods if you want to accept paths in your console app or if you want to invoke OS X functions using declares.

On Windows and Linux, use AbsolutePath instead, which even allows you to address relative paths, something that, oddly, doesn't work with "new FolderItem (path, FolderItem.PathTypeShell)".

5.2  Details on why ShellPath is not smart to use

For a few details, read this bug report: <feedback://showreport?report_id=15958>

Also, read this thread on the NUG mailing list: http://support.realsoftware.com/listarchives/realbasic-nug/2011-02/msg00813.html

6.  Code for your use (for REAL Studio)

  • CommandLineArgs - a Module providing a function to retrieve the cmdline arguments in a GUI (Desktop) application, because the RB-provided "System.CommandLine" isn't usable on OS X. Plus, this returns the arguments split up into an array, just like you get them in the Run event of a Console app. For OS X and Windows (sorry, no solution for Linux available).
  • ResolveNativePath() is a good helper for accepting relative paths in command line arguments. Just pass the path to this function and get back a FolderItem:
Protected Function ResolveNativePath(path As String) As FolderItem
  // This takes a "native" OS path as they get passed to console apps as arguments,
  // i.e. a POSIX path on Mac and Linux, and any "common" path on Windows.
  //
  // This function deals with two special tasks:
  // 1. The passed path may be relative - so it resolves that properly, based on the
  //     SpecialFolder.CurrentWorkingDirectory
  // 2. On OS X, there's no simple way to create a FolderItem from a POSIX path,
  //     so it solves this, too.
  //
  // See also: http://www.tempel.org/RB/FilePaths

  #if TargetWin32

    dim currentDisk as String
    dim currentPath as String = SpecialFolder.CurrentWorkingDirectory.AbsolutePath
    if currentPath.Mid(2,1) = ":" then
      // split the drive:path combo up into drive and path
      currentDisk = currentPath.Left(2)
      currentPath = currentPath.Mid(3)
    end

    if path.Left(2) = "\\" then
      // A UNC path
    elseif path.Left(1) = "\" then
      // An absolute path on the current working drive
      path = currentDisk + path
    elseif path.Mid(2,2) = ":\" then
      // An absolute path with specified drive
    elseif path.Mid(2,1) = ":" then
      // A relative path on the specified drive
      #if false
        // this doesn't work yet
        declare function CurDir lib "kernel32" (drv as CString) as CString
        currentPath = CurDir (path.Left(1))
        path = path.Left(2) + currentPath + path.Mid(3)
      #endif
    else
      // A relative path on the current working drive
      path = currentDisk + currentPath + path
    end

    return GetFolderItem (path, FolderItem.PathTypeAbsolute)

  #elseif TargetMacOS

    // This code converts the path into a Carbon path

    dim segments() as String = path.ReplaceAll(":",Chr(1)).Split("/")
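    dim isRelative as Boolean = true

    if segments.Ubound >= 1 then
      if segments(0) = "" then
        // A leading "/" means the path is absolute
        isRelative = false
        if segments(1) = "Volumes" then
          // A path on a mounted volume: drop the leading "" and "Volumes"
          segments.Remove 1
          segments.Remove 0
        else
          // An absolute path on the root volume
          segments(0) = Volume(0).Name
        end if
      end if
      // Otherwise it's a relative path with several segments (e.g. "sub/a.txt"),
      // which must not have its first segment replaced by the volume name
    end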

    path = Join (segments, ":")
    path = path.ReplaceAll(Chr(1),"/") // converts former legal ":" chars into "/"

    if isRelative then
      dim currPath as String = SpecialFolder.CurrentWorkingDirectory.AbsolutePath
      path = currPath + path
    end

    return GetFolderItem (path, FolderItem.PathTypeAbsolute)

  #elseif TargetLinux

    if path.Left(1) <> "/" then
      // A relative path
      dim currentPath as String = SpecialFolder.CurrentWorkingDirectory.AbsolutePath
      path = currentPath + path
    end

    try
      #pragma BreakOnExceptions off
      return GetFolderItem (path, FolderItem.PathTypeAbsolute)
    catch
      return nil
    end try

  #else

    #error // this is unexpected

  #endif
End Function
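
A possible way to call this from a console application is sketched below. It assumes that ResolveNativePath is in scope (a Protected method in a module needs the module's name as a prefix) and that args(0) holds the program's own path, so the real arguments start at index 1.

```realbasic
Function Run(args() as String) As Integer
  // Sketch of a console app's Run event: resolve every argument to a
  // FolderItem and report what was found.
  for i as Integer = 1 to args.Ubound
    dim f as FolderItem = ResolveNativePath(args(i))
    if f is nil then
      StdErr.WriteLine "Can't resolve: " + args(i)
    elseif not f.Exists then
      StdErr.WriteLine "Not found: " + args(i)
    else
      Print f.AbsolutePath
    end if
  next
  return 0
End Function
```

Because the arguments arrive already unescaped, they are passed to ResolveNativePath as they are, without any further unquoting.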

Page last modified on 2011-02-16, 19:26 UTC (do)