diff --git a/README.md b/README.md
index 4b001ed2c..a76393047 100644
--- a/README.md
+++ b/README.md
@@ -37,7 +37,7 @@ A provider can also be just a pointer to a remote REST service -- for example, c

 ## Llama Stack Distribution

-A Distribution is where APIs and Providers are assembled together to provide a consistent whole to the end application developer. You can mix-and-match providers -- some could be backed by inline code and some could be remote. As a hobbyist, you can serve a small model locally, but can choose a cloud provider for a large model. Regardless, the higher level APIs your app needs to work with don't need to change at all. You can even imagine moving across the server / mobile-device boundary as well always using the same uniform set of APIs for developing Generative AI applications.
+A Distribution is where APIs and Providers are assembled together to provide a consistent whole to the end application developer. You can mix-and-match providers -- some could be backed by local code and some could be remote. As a hobbyist, you can serve a small model locally, but can choose a cloud provider for a large model. Regardless, the higher level APIs your app needs to work with don't need to change at all. You can even imagine moving across the server / mobile-device boundary as well, always using the same uniform set of APIs for developing Generative AI applications.

 ## Installation

diff --git a/docs/cli_reference.md b/docs/cli_reference.md
index 07effdbbb..0aab7893e 100644
--- a/docs/cli_reference.md
+++ b/docs/cli_reference.md
@@ -208,7 +208,7 @@ $ llama distribution list
 +---------------+---------------------------------------------+----------------------------------------------------------------------+
 | Spec ID       | ProviderSpecs                               | Description                                                          |
 +---------------+---------------------------------------------+----------------------------------------------------------------------+
-| inline        | {                                           | Use code from `llama_toolchain` itself to serve all llama stack APIs |
+| local         | {                                           | Use code from `llama_toolchain` itself to serve all llama stack APIs |
 |               |   "inference": "meta-reference",            |                                                                      |
 |               |   "safety": "meta-reference",               |                                                                      |
 |               |   "agentic_system": "meta-reference"        |                                                                      |
@@ -220,7 +220,7 @@ $ llama distribution list
 |               |   "agentic_system": "agentic_system-remote" |                                                                      |
 |               | }                                           |                                                                      |
 +---------------+---------------------------------------------+----------------------------------------------------------------------+
-| ollama-inline | {                                           | Like local-source, but use ollama for running LLM inference         |
+| local-ollama  | {                                           | Like local, but use ollama for running LLM inference                |
 |               |   "inference": "meta-ollama",               |                                                                      |
 |               |   "safety": "meta-reference",               |                                                                      |
 |               |   "agentic_system": "meta-reference"        |                                                                      |
@@ -229,16 +229,16 @@
 ```

-As you can see above, each “spec” details the “providers” that make up that spec. For eg. The inline uses the “meta-reference” provider for inference while the ollama-inline relies on a different provider ( ollama ) for inference.
+As you can see above, each “spec” details the “providers” that make up that spec. For example, the `local` spec uses the “meta-reference” provider for inference, while the `local-ollama` spec relies on a different provider (ollama) for inference.

-Lets install the fully local implementation of the llama-stack – named `inline` above.
+Let's install the fully local implementation of the llama-stack – named `local` above.

 To install a distro, we run a simple command providing 2 inputs –

 - **Spec Id** of the distribution that we want to install ( as obtained from the list command )
 - A **Name** by which this installation will be known locally.

 ```
-llama distribution install --spec inline --name inline_llama_8b
+llama distribution install --spec local --name local_llama_8b
 ```

 This will create a new conda environment (name can be passed optionally) and install dependencies (via pip) as required by the distro.
@@ -246,12 +246,12 @@ This will create a new conda environment (name can be passed optionally) and ins
 Once it runs successfully , you should see some outputs in the form

 ```
-$ llama distribution install --spec inline --name inline_llama_8b
+$ llama distribution install --spec local --name local_llama_8b
 ....
 ....
 Successfully installed cfgv-3.4.0 distlib-0.3.8 identify-2.6.0 libcst-1.4.0 llama_toolchain-0.0.2 moreorless-0.4.0 nodeenv-1.9.1 pre-commit-3.8.0 stdlibs-2024.5.15 toml-0.10.2 tomlkit-0.13.0 trailrunner-1.4.0 ufmt-2.7.0 usort-1.0.8 virtualenv-20.26.3

-Distribution `inline_llama_8b` (with spec inline) has been installed successfully!
+Distribution `local_llama_8b` (with spec local) has been installed successfully!
 ```

 Next step is to configure the distribution that you just installed. We provide a simple CLI tool to enable simple configuration.
@@ -260,12 +260,12 @@ It will ask for some details like model name, paths to models, etc.
 NOTE: You will have to download the models if not done already. Follow instructions here on how to download using the llama cli

 ```
-llama distribution configure --name inline_llama_8b
+llama distribution configure --name local_llama_8b
 ```

 Here is an example screenshot of how the cli will guide you to fill the configuration
 ```
-$ llama distribution configure --name inline_llama_8b
+$ llama distribution configure --name local_llama_8b

 Configuring API surface: inference
 Enter value for model (required): Meta-Llama3.1-8B-Instruct
@@ -278,7 +278,7 @@ Do you want to configure llama_guard_shield? (y/n): n
 Do you want to configure prompt_guard_shield? (y/n): n

 Configuring API surface: agentic_system
-YAML configuration has been written to ~/.llama/distributions/inline0/config.yaml
+YAML configuration has been written to ~/.llama/distributions/local0/config.yaml
 ```

 As you can see, we did basic configuration above and configured inference to run on model Meta-Llama3.1-8B-Instruct ( obtained from the llama model list command ).
@@ -290,12 +290,12 @@ For how these configurations are stored as yaml, checkout the file printed at th
 Now let’s start the distribution using the cli.

 ```
-llama distribution start --name inline_llama_8b --port 5000
+llama distribution start --name local_llama_8b --port 5000
 ```

 You should see the distribution start and print the APIs that it is supporting,
 ```
-$ llama distribution start --name inline_llama_8b --port 5000
+$ llama distribution start --name local_llama_8b --port 5000

 > initializing model parallel with size 1
 > initializing ddp with size 1
@@ -329,7 +329,7 @@ Lets test with a client

 ```
 cd /path/to/llama-toolchain
-conda activate <env>  # ( Eg. local_inline in above example )
+conda activate <env>  # ( Eg. local_llama_8b in above example )
 python -m llama_toolchain.inference.client localhost 5000
 ```

diff --git a/llama_toolchain/cli/distribution/install.py b/llama_toolchain/cli/distribution/install.py
index 68d42938d..a056dba36 100644
--- a/llama_toolchain/cli/distribution/install.py
+++ b/llama_toolchain/cli/distribution/install.py
@@ -36,7 +36,7 @@ class DistributionInstall(Subcommand):
         self.parser.add_argument(
             "--spec",
             type=str,
-            help="Distribution spec to install (try ollama-inline)",
+            help="Distribution spec to install (try local-ollama)",
             required=True,
             choices=[d.spec_id for d in available_distribution_specs()],
         )
diff --git a/llama_toolchain/data/default_inference_config.yaml b/llama_toolchain/data/default_inference_config.yaml
deleted file mode 100644
index eda4c9b47..000000000
--- a/llama_toolchain/data/default_inference_config.yaml
+++ /dev/null
@@ -1,14 +0,0 @@
-inference_config:
-  impl_config:
-    impl_type: "inline"
-    checkpoint_config:
-      checkpoint:
-        checkpoint_type: "pytorch"
-        checkpoint_dir: {checkpoint_dir}/
-        tokenizer_path: {checkpoint_dir}/tokenizer.model
-        model_parallel_size: {model_parallel_size}
-        quantization_format: bf16
-    quantization: null
-    torch_seed: null
-    max_seq_len: 16384
-    max_batch_size: 1
diff --git a/llama_toolchain/distribution/install_distribution.sh b/llama_toolchain/distribution/install_distribution.sh
index f0c66c99f..6ae74392c 100755
--- a/llama_toolchain/distribution/install_distribution.sh
+++ b/llama_toolchain/distribution/install_distribution.sh
@@ -96,7 +96,7 @@ ensure_conda_env_python310() {

 if [ "$#" -ne 3 ]; then
   echo "Usage: $0 <env_name> <distribution_name> <pip_dependencies>" >&2
-  echo "Example: $0 my_env local-inline 'numpy pandas scipy'" >&2
+  echo "Example: $0 my_env local-llama-8b 'numpy pandas scipy'" >&2
   exit 1
 fi

diff --git a/llama_toolchain/distribution/registry.py b/llama_toolchain/distribution/registry.py
index a60b3cd4f..b208abf9c 100644
--- a/llama_toolchain/distribution/registry.py
+++ b/llama_toolchain/distribution/registry.py
@@ -28,7 +28,7 @@ def available_distribution_specs() -> List[DistributionSpec]:
     providers = api_providers()
     return [
         DistributionSpec(
-            spec_id="inline",
+            spec_id="local",
             description="Use code from `llama_toolchain` itself to serve all llama stack APIs",
             provider_specs={
                 Api.inference: providers[Api.inference]["meta-reference"],
@@ -42,8 +42,8 @@ def available_distribution_specs() -> List[DistributionSpec]:
             provider_specs={x: remote_spec(x) for x in providers},
         ),
         DistributionSpec(
-            spec_id="ollama-inline",
-            description="Like local-source, but use ollama for running LLM inference",
+            spec_id="local-ollama",
+            description="Like local, but use ollama for running LLM inference",
             provider_specs={
                 Api.inference: providers[Api.inference]["meta-ollama"],
                 Api.safety: providers[Api.safety]["meta-reference"],
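
Reviewer note: a quick way to sanity-check the rename is to list the spec IDs the registry exposes. The sketch below is only illustrative, not part of the patch; it assumes `llama_toolchain` from this branch is installed in the active environment and that `available_distribution_specs` is importable from the module touched above (`llama_toolchain/distribution/registry.py`).

```
# Sketch only: enumerate distribution specs after the rename.
# Assumes llama_toolchain from this branch is installed; import path mirrors
# the file edited in the diff above.
from llama_toolchain.distribution.registry import available_distribution_specs

for spec in available_distribution_specs():
    # Expected IDs now include "local" and "local-ollama"
    # (formerly "inline" and "ollama-inline").
    print(f"{spec.spec_id}: {spec.description}")
```

This is the same list that `llama distribution install --spec ...` validates against via the `choices` argument in `install.py`, so the CLI help and the registry stay in sync after the rename.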