Uri vs. String

We are currently having a big debate internally around the usage of Uri and String in our APIs.

Use Uri not String

URI canonicalization and escaping are very complex and people always get them wrong, so you must have a single codebase to do the work. Beyond that, converting between forms of a URI can introduce artifacts, so once you resolve (my shorthand for canonicalize, escape, etc.) the URI you can really never go back to a string and try to resolve again (not precisely true, but basically). In Internet Explorer the resolving of URIs was a huge source of security issues, so we can't repeat that mistake in WinFS.

Offer both Uri and String

Using a single codebase for resolving URIs is great, just don't make developers suffer by having to create URIs everywhere. Developers think of URIs as strings, not complex objects.

The pattern should look like:

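class MyComponent {
    private Uri something;
    // Convenience overload: only news up a Uri and forwards it.
    public void SetSomething(string uri) { SetSomething(new Uri(uri)); }
    public void SetSomething(Uri uri) { something = uri; }
    public Uri GetSomething() { return something; }
}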

The idea being that developers can call either version of the API. The implementation of SetSomething(string) would do nothing more than new up a Uri and call SetSomething(Uri).

Use Uri not String

The problem with the above pattern is that in a component-based system you have people calling components through multiple layers. If internal MSFT developers (and third-party component developers) start calling the string version, we get back into the "string -> uri -> string -> uri" world, which introduces loss of data (hence security issues).
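The loss described here is easy to reproduce. A minimal sketch, assuming a hypothetical Layers class in which every layer receives a plain string and cannot tell whether it has already been escaped; only Uri.EscapeDataString and Uri.UnescapeDataString are real framework calls:

```csharp
using System;

// Hypothetical illustration: layers that pass a path segment around as a string.
public static class Layers
{
    // Each layer "helpfully" escapes the string it was handed,
    // because a string carries no record of whether that already happened.
    public static string EscapeOnce(string segment) => Uri.EscapeDataString(segment);

    // Passing through two layers escapes twice: "a b" -> "a%20b" -> "a%2520b".
    public static string ThroughTwoLayers(string segment) => EscapeOnce(EscapeOnce(segment));

    // A single unescape at the far end no longer recovers the original text.
    public static string DecodeAtTheEnd(string escaped) => Uri.UnescapeDataString(escaped);
}
```

Decoding the result of ThroughTwoLayers("a b") once gives "a%20b" rather than "a b". The mirror-image mistake, unescaping twice, is the sort of bug that tends to become a security issue.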

Offer both Uri and String

We should just have a rule that says once you convert a string to a Uri, never go back. Then developers on the leaf of a component hierarchy would do the conversion, and the Uri would flow through the layers from there.

Use Uri not String

Ahh, that is a great view; however, history has shown us that it doesn't work. It isn't possible to guarantee that no one converts back to a string, and therefore we will ship a bunch of components with this problem.

... And the debate continues to rage...

12/13/2003 10:19 AM

I have this debate about the File/FileInfo classes and string paths with my developers on a semi-regular basis. My argument is that using the class (Uri or File) preserves the semantics of the meaning better than some arbitrary string. Not only for the readability/understandability of the API, but also the readability of the method itself.

Wayne Allen | http://weblogs.asp.net/wallen | wayneAT NOSPAMiclemartin dot com

12/13/2003 12:03 PM

At my workplace the problem is that I am working in Java, but with colleagues who recently left a year or two of PHP after a brief course in Java. They can write code, but the problem is that when they find stuff like complex classes, they convert them to basic types and offer only basic types from their classes. This brings out (1) lack of cohesion and (2) lack of control, so they generate a lot of bugs which they wouldn't have thought of.

Since Microsoft is a big big company and can afford to do the things the right way, please use URI, export URIs and live happy.

Davide Inglima - limaCAT | http://limacat.blogspot.com | hadesnebulaAT NOSPAMdespammed dot com

12/13/2003 4:19 PM

IMHO, you gotta bite the bullet and go with pure URIs. But, alas, this in and of itself will not solve the problem and introduces other problems. Here are the issues that I've repeatedly run into:

1. Roundtripping -- you use pure URIs -- but then the puppy gets serialized. Serialized into a URL -- you may get URL encoding problems. Serialized into HTML you may get HTML encoding problems. Serialized into a SOAP payload...

For example, the latest WSS simply punts and doesn't allow extended characters in filenames (you can't upload a file whose name contains a "{"). I have no problem with this, except that previous versions did allow them, the local OS lets you have these characters in a filename, and the user can't understand why things break. Furthermore, the form lets you upload them. And the WebDAV redirector lets you think you uploaded them (sometimes...)

2. Combining base and relative paths. I've seen cases, perhaps because of my own stupidity, where you'll end up with "//" in the resulting URI. The URI class simply doesn't care/know because it's actually not got any knowledge (nor should it) of the actual resource provider. So ... I've got lots of defensive code that builds URIs safely and then only ever uses the resulting absolute URI as a string. Encoding problems still possible.

3. Storage ... people (and machines programmed by them) tend to make assumptions -- and thinking you can work with a URI as a string is the basis of this. There are countless examples of components hardcoding their vision of how long a URI can be. I've had to play silly tricks one too many times to encode something.

4. The REST debate -- I fear that attempts to optimize SOAP calls will result in expectations around the length of commands encoded in a URI. This entire area should be nipped in the bud because of the countless examples of how things break when you cross the 256 or 512 byte boundary.

WinFS should clearly state the rules for what encoding is to be used, provide means to encode/decode in the common payload/transport mechanisms (HTTP URL, HTML, XML), declare explicitly the requirements of parsers with regard to the storage needed to process URIs, and perhaps provide warning/trace statements that ALWAYS appear when compiling and/or executing the classes throughout the entire beta period, in debug builds, in consumer debug builds, etc. By adopting an admittedly draconian approach, eventually, maybe, enough folks will come to appreciate the nuances and simply follow the best practice. Perhaps this should even be a Longhorn logo requirement (warning: significant retro-fitting for MS' own products if taken to this extreme).
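As a rough illustration of point 2 above, here is a sketch of the kind of defensive combining code being described. UriCombiner and its methods are hypothetical names, and the sketch assumes the base URI has no query string or fragment; the only framework pieces used are Uri.AbsoluteUri and the Uri(Uri, string) constructor.

```csharp
using System;

// Hypothetical helper for point 2: combine a base and a relative path.
public static class UriCombiner
{
    // Naive concatenation: yields "//" when both sides carry a slash.
    public static string NaiveCombine(string baseUrl, string relative) => baseUrl + relative;

    // Let the Uri class resolve the relative reference against the base.
    // The base is normalized to end in "/" so its last segment is kept,
    // and leading slashes are dropped so the reference stays relative to it.
    // Assumption: baseUri has no query string or fragment.
    public static Uri Combine(Uri baseUri, string relative)
    {
        string baseText = baseUri.AbsoluteUri;
        if (!baseText.EndsWith("/", StringComparison.Ordinal)) baseText += "/";
        return new Uri(new Uri(baseText), relative.TrimStart('/'));
    }
}
```

With a base of http://example.com/app and a relative part of /docs/a.txt, Combine gives http://example.com/app/docs/a.txt, while NaiveCombine on "http://example.com/app/" and "/docs/a.txt" gives the doubled slash. Whether "//" is actually wrong still depends on the resource provider, which is the commenter's point.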

phil | http://componentry.com/blogs/phil/index.html | philAT NOSPAMcomponentry dot com

12/13/2003 6:49 PM

In general, I'm of the opinion your method signatures should be as accurate as possible. Your method signature is your "first line of defense" against improper use of your component.

If you want to specify that something should happen after a certain period of time, you obviously use a TimeSpan object. A numeric type could do the trick, and always has in the past, but it's just not as safe as using a TimeSpan; the TimeSpan class more clearly represents your intent, and though it requires a bit more typing (i.e., DoFooIn(TimeSpan.FromSeconds(30)) instead of DoFooIn(30)), it's safer overall. For example, DoFooIn(-100) isn't very safe.

URI vs. string is the same issue. If your method really expects a URI, then you should pass a URI. Your API itself should answer as many questions as possible; using "strongly typed" parameters allows your API to further communicate the intent of your method.

Message.SendTo(Uri target)

is immediately more understandable than:

Message.SendTo(string target)

If I'm using (string target), shouldn't I expect that Message.SendTo("chrisca208@msn.com") will work? That's a reasonable expectation, but it is obviously going to fail if a URI is expected. I have a better development experience if I know that right away.

When using a more specific type for your parameter, you get design time/compile time assistance in validating your input. It relieves a burden from both the component developer and the component consumer. Sounds like a win-win. :)
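To make the SendTo example concrete, here is a small sketch. The Message class, its return value, and TryParseTarget are hypothetical stand-ins; Uri.TryCreate and UriKind.Absolute are the real framework pieces.

```csharp
using System;

// Hypothetical Message type, modeled on the SendTo example above.
public static class Message
{
    // The signature states the requirement: callers must already hold a Uri.
    public static string SendTo(Uri target)
    {
        if (target == null) throw new ArgumentNullException(nameof(target));
        return "sending via " + target.Scheme;
    }

    // What a caller has to do with raw text: a bad value is rejected here,
    // at the point of conversion, not somewhere inside SendTo.
    public static bool TryParseTarget(string text, out Uri target) =>
        Uri.TryCreate(text, UriKind.Absolute, out target);
}
```

A bare "chrisca208@msn.com" has no scheme, so TryParseTarget rejects it, while "mailto:chrisca208@msn.com" parses and can be handed to SendTo. The caller finds out at the conversion, before the message layer is ever involved.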

chris | http://objective.mine.nu | chrisca208AT NOSPAMmsn dot com

12/14/2003 1:26 AM

As long as you can create a complete URI of just about any usable type in the constructor, how much of a problem is it to pass in a URI? The handling of data is much cleaner, and as long as the ctor has enough params to be really usable, it isn't really much worse than having to create a string to pass in.

Keith Patrick | mailto:kpatrickAT NOSPAMhouston dot rr dot com | kpatrickAT NOSPAMhouston dot rr dot com

12/14/2003 8:34 AM

I have been a fan of keeping things object-oriented just because it allows greater flexibility. The Uri class is a great class for the purpose of storing URI compliant strings. It allows you to parse it out when needed. However I'm unsure of the performance cost associated with the Uri class if any when compared to a normal string.

I guess I would bite the performance cost if the URI class gave me the functionality that the object or application required.

But the idea of providing methods with both a Uri and String argument is a good one if you need to support Uri but do not want to lock the developer into having to create a Uri for themselves. Let the overloaded method do it.

Of course, you need to think about these things when creating APIs for other developers to use. The whole concept of API development is that it makes doing something easier.

Always code for the lazy developer that will be using your API.

Adam Weigert | http://weblogs.notevil.net/adam/ | adam [dot] weigert [at] wegmans [dot] com

12/14/2003 12:09 PM

It's always best when an API lets developers do as few stupid things as possible, because inevitably some of them will do things as stupidly as possible. In this light, allowing strings is just asking for trouble.

James Bellinger | mailto:bellinger dot six (use the digit six), atNoSpam osu dot edu | bellinger dot six (use the digit six), atNoSpam osu dot edu

12/14/2003 3:35 PM

I believe that File and Uri classes are necessary as they take the "Magic" out of strings and elevate them to real objects with real semantic meaning. My only complaint is that the Uri and File classes don't offer enough helper functions to dig around in Uris, Urls, etc.

Scott Hanselman | mailto:scottAT NOSPAMhanselman dot com | scottAT NOSPAMhanselman dot com

12/14/2003 9:23 PM

Sounds like you need union types --- can you fake it with generics instead?

template<class T>
void SetSomething(T uri_specifier)

and the compiler should be able to catch it if called with something neither a URI nor a string?

12/14/2003 10:58 PM

Use Uri; the 9 extra characters aren't going to kill anyone. But really, many people will be oblivious to the security issues - using Uri will hopefully prevent accidental mistakes. Finally, new Uri(Uri.ToString()) shouldn't suffer from any security problems. It would seem this would be good for creating a pit of success.

12/15/2003 6:05 AM

There are many cases when you only want to represent a file path. The extra validation that Uri gets you doesn't save you much effort if you still have to check that it's a valid file Uri for a valid path. I'm in favor of either adding many more helper methods to Uri for dealing with filesystem paths or otherwise making the Path object itself instantiable. The former would cause less API churn as future versions of methods gain the ability to work across transports other than the local filesystem.

Josh Christie | http://www.joshchristie.com/weblog

12/15/2003 8:02 AM


IMO, you need to accept either String or Uri, and should return the Uri.

Might be a little old school, but I expect to name things on my computer. The name is the string of characters a user types. The name isn't the set of properties describing the name.

On a more practical level, this is the most discoverable approach. If you get back a Uri for property Bar, it's easy with Intellisense to see what I can do with a Uri. It's a little hard to discover the "proper" way to create one.

KC

Ken Cowan | mailto:kccowanAT NOSPAMyahoo dot com | kccowanAT NOSPAMyahoo dot com

12/15/2003 8:02 AM

Actually the Uri class can represent a path, and you can create it pretty easily:

new Uri(@"c:\foo.txt")

Or, are you looking for Path.* type methods on URI for cracking the filesystem path?

Chris Anderson | http://blog.simplegeek.com | chris_l_andersonAT NOSPAMhotmail dot com

12/15/2003 9:23 AM

There's one missing:
Accept string/store as string.

Validate in the function that it is a good URI, throw if not valid.

It's just like when you have an Enum type and you do the Enum.IsDefined check in the setter.

JFo

12/15/2003 10:37 AM

JFo: The issue with that approach is that the string doesn't store the information that it is valid. In addition, a URI can be in many forms (canonicalized, escaped, etc.) and again you can't store that in a string.
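For reference, a minimal sketch of what the "accept string, validate, store as string" suggestion might look like. StringBackedComponent and its Location property are hypothetical; Uri.IsWellFormedUriString is the real framework call doing the check.

```csharp
using System;

// Hypothetical component: accepts a string, validates it, stores a string.
public class StringBackedComponent
{
    private string location = "";

    public string Location
    {
        get => location;
        set
        {
            // Validate on the way in, much like an Enum.IsDefined check in a setter.
            if (!Uri.IsWellFormedUriString(value, UriKind.Absolute))
                throw new ArgumentException("Not a well-formed absolute URI.", nameof(value));
            location = value;
        }
    }
}
```

The setter does reject bad input, but the getter hands back a plain string. Whoever reads Location next has no evidence the check ever ran, nor which form (escaped or not) the text is in, so they must either validate again or trust it.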

Chris Anderson | http://blog.simplegeek.com | chris_l_andersonAT NOSPAMhotmail dot com

12/15/2003 2:17 PM

Chris,
I'm looking for the Path.* methods for Uri. If Uri is indeed going to be the preferred way for dealing with all paths whether they are on the filesystem or some other transport, then users shouldn't be required to do things like Path.GetFileName(Uri.ToString()). Something like Uri.FileName would be much, much clearer.
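Until something like that exists, a helper along these lines can stand in. This is only a sketch: UriPathHelpers.GetFileName is a hypothetical name (System.Uri has no FileName property), built from Uri.AbsolutePath, Uri.UnescapeDataString and Path.GetFileName.

```csharp
using System;
using System.IO;

// Hypothetical helper; System.Uri itself has no FileName property.
public static class UriPathHelpers
{
    // Returns the last path segment of the URI, unescaped.
    // AbsolutePath excludes the query string and fragment.
    public static string GetFileName(Uri uri)
    {
        if (uri == null) throw new ArgumentNullException(nameof(uri));
        string path = Uri.UnescapeDataString(uri.AbsolutePath);
        return Path.GetFileName(path);
    }
}
```

Working from AbsolutePath rather than ToString() keeps a query string such as ?rev=2 from leaking into the file name.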

Josh Christie | http://www.joshchristie.com/weblog

12/15/2003 9:10 PM

I think it should definitely be stored as a URI. A URI should, by definition, be validated to be a URI. There must at some point be a safeguard to maintain the resolution of the data, and converting to String potentially reduces it. As far as exposing it as a String, I can see both sides, but I lean towards just exposing things as what they are. WinFX is an OO API, so it should truly represent things with objects. Strings are like the managed version of pointers: everymen for coding shortcuts. But why have the abstraction of classes if you short circuit something that has structure and variety (in a base library, no less) with a string?

Keith Patrick

12/16/2003 1:14 AM

Accept only Uri, and have an explicit conversion operator on the Uri object that eats strings. That way methods at worst will be

string bar = "http://somewhere";
foo((Uri)bar);

which is only another five characters. You _could_ have an implicit operator conversion and then you could do

string bar = "http://somewhere";
foo(bar);

But then there's no marker in code that says it *should* be a URI (Because it should, right?) and it'd be great if it actually was one.
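Conversion operators can only be declared on a type you own, so existing code can't add one to System.Uri from the outside. A sketch of the idea using a hypothetical wrapper type, ResourceUri, with a hypothetical Consumer.Foo standing in for foo:

```csharp
using System;

// Hypothetical wrapper type; System.Uri defines no conversion from string.
public sealed class ResourceUri
{
    public Uri Value { get; }

    public ResourceUri(Uri value)
    {
        Value = value ?? throw new ArgumentNullException(nameof(value));
    }

    // Explicit: the cast is the marker in code that says "this should be a URI".
    public static explicit operator ResourceUri(string text) => new ResourceUri(new Uri(text));
}

public static class Consumer
{
    // Stand-in for foo(...): accepts only the URI type, never a bare string.
    public static string Foo(ResourceUri target) => target.Value.Host;
}
```

Calling Consumer.Foo((ResourceUri)"http://somewhere") compiles and returns the host, while Consumer.Foo("http://somewhere") does not compile. An implicit operator would make the second call legal and remove the marker.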

Paul Hill | mailto:PaulHill35AT NOSPAMhotmail dot com | PaulHill35AT NOSPAMhotmail dot com

12/16/2003 12:10 PM

Don't buck a good trend.

In C, strings were null-terminated character arrays, which led to much abuse. Every C book had a section dedicated to strings, arrays, and pointers simply because you always had to be aware of what happened under the hood so as not to hang yourself.
In the managed world string objects took away most of the pain and life became better.

C strings are ASCII strings; Unicode, BSTRs and other such imps were just (sometimes painful) workarounds at best for globalizing strings.
In the managed world strings are Unicode and Encodings handle the grimier bits; and life became better.

Munging around with file paths could often bite me in corner cases (in any language),
until I came across the Path.* helpers mentioned above; since then, happy happy joy joy, life became better.

Don't buck the trend. If URIs are better than strings and it is communicated effectively in the APIs and documentation (e.g. URI.* helpers, stern security warnings in docs and MS published books), developers will 'get it' and slowly but surely make the switch.

If strings are supplied as alternatives everywhere, then the message will be: strings ARE URIs, use them interchangeably. Of course every other joe coder will use just strings and build APIs that use just strings (none of that shmancy URI for me). Make life better.

Ifeanyi Echeruo

12/16/2003 5:06 PM

Forgive the typos above; an 8:30am to 2:30am day will do that to you :P

Ifeanyi Echeruo

12/16/2003 6:07 PM

I posted earlier in the thread ... Ifeanyi makes very clear my past experiences and thus my recommendation for URI only. The URI and Path classes are new and at this point in time they make things much safer and internationalizable.

That said, I know of few APIs, if any, that actually require that they be used -- resulting in their use (I presume from a proof point of myself and a few others) by folks who've been bitten in the butt when shipping code internationally.

This question shouldn't be restricted to the WinFS API set ... it should be for the entire umbrella of Longhorn APIs. Simply using UTF8 and strings is not enough as experience has unfortunately shown one too many times: you only know what you know and it would seem that the string advocates are purely 1033 codepage folks (if they are even aware of codepages).

Another proof point to consider is the Groove APIs (http://groove.net/devzone). Groove pushed the envelope 3+ years ago by clearly differentiating between canonical URIs and bound/bindable URIs. Canonical URIs are "abstract" in that they can't be validated against a specific resource. Bindable URI/URLs tie to specific, persisted, accessible resources - where the "binding" is associated with the identity/account.

In this model (which I can only assume WinFS will have to adopt in some form or another) just because you know the "path" to a "potential" resource, doesn't mean that you have "rights" to access the resource. You need to use bound/bindable URLs to reference a document fragment in a persistent store -- but providing this reference (in UTF8 encoding) to an external agent/program/user is utterly useless as their "bound" URI/URL will be different because they are a distinct identity. So ... in this model you can expose the canonical URI without regard to whether the resulting attempt to bind to it will succeed. Exposing the bindable URI to anybody else but the originator is bound (pun intended ;) to fail.

This last level of detail (which, as I understand it, is part of the w3c standard with regard to URIs) is where you *have* to get to in order to build robust, distributed, replicating storage systems.

Sorry for the long post...

phil | http://componentry.com/blogs/phil/index.html | philAT NOSPAMcomponentry dot com

12/18/2003 3:36 PM

I think from the majority of the comments it is clear that using a Uri is the preferred approach. Just to add my 2c, when designing an API or framework, I feel that it is important that the API/framework promotes best practice without the developer having to be concerned about these particular issues. And when it comes to managing Uris I believe that most developers are unaware of the issues, and we should be guided by the framework. When working on a project, once the best practices have been identified I proceed to build a layer above the framework that encourages those best practices, and it would be great if the framework were doing that for us already. Of course not at the expense of flexibility, but in the case of the Uri I do not expect that this would be an issue.

Chris Taylor | mailto:chris_taylor_zaAThotmailDOTcom | chris_taylor_zaAThotmailDOTcom

12/18/2003 3:50 PM

I am Mort. I can handle URIs if you tell me it's important.

So, please implement URIs, and please do a damn good job of explaining why they're useful. Evangelize the crap out of them.

If a method requires that I pass a URI to MemberMethod(new URI(string garbage)), I'm capable of doing that.

As long as I can Google or MSDN Search the answer quickly, it's a learn-once pattern that helps everyone.

Tristan | --- | ---

12/24/2003 10:29 PM

There are several reasons why I think it's better to use Uri than String.

1. If Uri is the natural data type that the method will want internally, then there are two options: either the parameter can be of type Uri, or there can be both Uri *and* string. This is because the third option, string only, would mean that if the caller already has a Uri, it has to be converted to a string and the method then has to construct a second Uri from it, which is a waste. Hence, either there should be one method that takes a Uri, or two overloads, one that takes a Uri and one that takes a string.

2. Fewer overloads make for a cleaner API.

3. It's more type safe for the parameter type to be a Uri. If it were a string, any variable of type string can be passed in. String is too generic of a type.

Kenneth Kasajian | mailto:kenneth dot kasajianAT NOSPAMwonderare dot com | kenneth dot kasajianAT NOSPAMwonderare dot com

12/28/2003 9:20 PM

It's almost like API functions that "WANT" an XML fragment but where the method has a string parameter... a first glance indicates that the method can accept simple string data that may not even be valid XML, and then a runtime bug is introduced.

Same with URIs. I've started building an imaging framework for my company that can refer to disparate sources of images (in a database as a BLOB, a local file, an image from the internet), and I've used the Uri class to be a lot more specific about the intended parameter; even the local file path is parsed into a Uri internally and exposed back out as a file URI.

In summary, use the Uri as a parameter, and I would encourage NOT providing a string parameter overload...

Eric Newton | mailto:ericnewton76AT NOSPAMhotmail dot com | ericnewton76AT NOSPAMhotmail dot com

01/05/2004 1:14 PM

Using Uri instead of string makes the meaning and use clear to me. So I would fall on the side of using Uri vs. string in a method signature. Now if the IDE/compiler were smart enough to give a Uri type special treatment, similar to how it gives string special treatment, that would make it more convenient.

For example
Declaration Foo( System.String name )
Usage obj.Foo( @"Jim Bob" );

Declaration Foo( System.Uri name )
Usage obj.Foo( @"http://msnd.microsoft.com" );

Norman Headlam | http://normanheadlam.blogspot.com/ | nheadlamAT NOSPAMcomcast dot net

01/15/2004 9:14 AM

Starting to implement URI as a true resource indicator; however, Uri doesn't have appropriate decorations for COM...

could you guys correct this hopefully with framework 1.2?

I've cheated by subclassing System.Uri and exposing to COM

Eric Newton | mailto:ericnewton76AT NOSPAMhotmail dot com | ericnewton76AT NOSPAMhotmail dot com

