By this point, many users already have questions about sessions.
This section presents a mini-FAQ (note that we have also a real FAQ)
of the most basic issues one is presented with when using a Session.
Just one time, somewhere in your application’s global scope. It should be
looked upon as part of your application’s configuration. If your
application has three .py files in a package, you could, for example,
place the sessionmaker line in your __init__.py file; from
that point on your other modules say “from mypackage import Session”. That
way, everyone else just uses Session(),
and the configuration of that session is controlled by that central point.
If your application starts up, does imports, but does not know what
database it’s going to be connecting to, you can bind the
Session at the “class” level to the
engine later on, using sessionmaker.configure().
In the examples in this section, we will frequently show the
sessionmaker being created right above the line where we actually
invoke Session. But that’s just for
example’s sake! In reality, the sessionmaker would be somewhere
at the module level. The calls to instantiate Session
would then be placed at the point in the application where database
conversations begin.
When do I construct a Session, when do I commit it, and when do I close it?
tl;dr;
As a general rule, keep the lifecycle of the session separate and
external from functions and objects that access and/or manipulate
database data. This will greatly help with achieving a predictable
and consistent transactional scope.
Make sure you have a clear notion of where transactions
begin and end, and keep transactions short, meaning, they end
at the series of a sequence of operations, instead of being held
open indefinitely.
A Session is typically constructed at the beginning of a logical
operation where database access is potentially anticipated.
The Session, whenever it is used to talk to the database,
begins a database transaction as soon as it starts communicating.
Assuming the autocommit flag is left at its recommended default
of False, this transaction remains in progress until the Session
is rolled back, committed, or closed. The Session will
begin a new transaction if it is used again, subsequent to the previous
transaction ending; from this it follows that the Session
is capable of having a lifespan across many transactions, though only
one at a time. We refer to these two concepts as transaction scope
and session scope.
The implication here is that the SQLAlchemy ORM is encouraging the
developer to establish these two scopes in their application,
including not only when the scopes begin and end, but also the
expanse of those scopes, for example should a single
Session instance be local to the execution flow within a
function or method, should it be a global object used by the
entire application, or somewhere in between these two.
The burden placed on the developer to determine this scope is one
area where the SQLAlchemy ORM necessarily has a strong opinion
about how the database should be used. The unit of work pattern
is specifically one of accumulating changes over time and flushing
them periodically, keeping in-memory state in sync with what’s
known to be present in a local transaction. This pattern is only
effective when meaningful transaction scopes are in place.
It’s usually not very hard to determine the best points at which
to begin and end the scope of a Session, though the wide
variety of application architectures possible can introduce
challenging situations.
A common choice is to tear down the Session at the same
time the transaction ends, meaning the transaction and session scopes
are the same. This is a great choice to start out with as it
removes the need to consider session scope as separate from transaction
scope.
While there’s no one-size-fits-all recommendation for how transaction
scope should be determined, there are common patterns. Especially
if one is writing a web application, the choice is pretty much established.
A web application is the easiest case because such an application is already
constructed around a single, consistent scope - this is the request,
which represents an incoming request from a browser, the processing
of that request to formulate a response, and finally the delivery of that
response back to the client. Integrating web applications with the
Session is then the straightforward task of linking the
scope of the Session to that of the request. The Session
can be established as the request begins, or using a lazy initialization
pattern which establishes one as soon as it is needed. The request
then proceeds, with some system in place where application logic can access
the current Session in a manner associated with how the actual
request object is accessed. As the request ends, the Session
is torn down as well, usually through the usage of event hooks provided
by the web framework. The transaction used by the Session
may also be committed at this point, or alternatively the application may
opt for an explicit commit pattern, only committing for those requests
where one is warranted, but still always tearing down the Session
unconditionally at the end.
Some web frameworks include infrastructure to assist in the task
of aligning the lifespan of a Session with that of a web request.
This includes products such as Flask-SQLAlchemy,
for usage in conjunction with the Flask web framework,
and Zope-SQLAlchemy,
typically used with the Pyramid framework.
SQLAlchemy recommends that these products be used as available.
In those situations where the integration libraries are not
provided or are insufficient, SQLAlchemy includes its own “helper” class known as
scoped_session. A tutorial on the usage of this object
is at Contextual/Thread-local Sessions. It provides both a quick way
to associate a Session with the current thread, as well as
patterns to associate Session objects with other kinds of
scopes.
As mentioned before, for non-web applications there is no one clear
pattern, as applications themselves don’t have just one pattern
of architecture. The best strategy is to attempt to demarcate
“operations”, points at which a particular thread begins to perform
a series of operations for some period of time, which can be committed
at the end. Some examples:
A background daemon which spawns off child forks
would want to create a Session local to each child
process, work with that Session through the life of the “job”
that the fork is handling, then tear it down when the job is completed.
For a command-line script, the application would create a single, global
Session that is established when the program begins to do its
work, and commits it right as the program is completing its task.
For a GUI interface-driven application, the scope of the Session
may best be within the scope of a user-generated event, such as a button
push. Or, the scope may correspond to explicit user interaction, such as
the user “opening” a series of records, then “saving” them.
As a general rule, the application should manage the lifecycle of the
session externally to functions that deal with specific data. This is a
fundamental separation of concerns which keeps data-specific operations
agnostic of the context in which they access and manipulate that data.
E.g. don’t do this:
### this is the **wrong way to do it** ###
class ThingOne(object):
def go(self):
session = Session()
try:
session.query(FooBar).update({"x": 5})
session.commit()
except:
session.rollback()
raise
class ThingTwo(object):
def go(self):
session = Session()
try:
session.query(Widget).update({"q": 18})
session.commit()
except:
session.rollback()
raise
def run_my_program():
ThingOne().go()
ThingTwo().go()
Keep the lifecycle of the session (and usually the transaction)
separate and external:
### this is a **better** (but not the only) way to do it ###
class ThingOne(object):
def go(self, session):
session.query(FooBar).update({"x": 5})
class ThingTwo(object):
def go(self, session):
session.query(Widget).update({"q": 18})
def run_my_program():
session = Session()
try:
ThingOne().go(session)
ThingTwo().go(session)
session.commit()
except:
session.rollback()
raise
finally:
session.close()
The most comprehensive approach, recommended for more substantial applications,
will try to keep the details of session, transaction and exception management
as far as possible from the details of the program doing its work. For
example, we can further separate concerns using a context manager:
### another way (but again *not the only way*) to do it ###
from contextlib import contextmanager
@contextmanager
def session_scope():
"""Provide a transactional scope around a series of operations."""
session = Session()
try:
yield session
session.commit()
except:
session.rollback()
raise
finally:
session.close()
def run_my_program():
with session_scope() as session:
ThingOne().go(session)
ThingTwo().go(session)
Is the Session a cache?
Yeee…no. It’s somewhat used as a cache, in that it implements the
identity map pattern, and stores objects keyed to their primary key.
However, it doesn’t do any kind of query caching. This means, if you say
session.query(Foo).filter_by(name='bar'), even if Foo(name='bar')
is right there, in the identity map, the session has no idea about that.
It has to issue SQL to the database, get the rows back, and then when it
sees the primary key in the row, then it can look in the local identity
map and see that the object is already there. It’s only when you say
query.get({some primary key}) that the
Session doesn’t have to issue a query.
Additionally, the Session stores object instances using a weak reference
by default. This also defeats the purpose of using the Session as a cache.
The Session is not designed to be a
global object from which everyone consults as a “registry” of objects.
That’s more the job of a second level cache. SQLAlchemy provides
a pattern for implementing second level caching using dogpile.cache,
via the Dogpile Caching example.
Is the session thread-safe?
The Session is very much intended to be used in a
non-concurrent fashion, which usually means in only one thread at a
time.
The Session should be used in such a way that one
instance exists for a single series of operations within a single
transaction. One expedient way to get this effect is by associating
a Session with the current thread (see Contextual/Thread-local Sessions
for background). Another is to use a pattern
where the Session is passed between functions and is otherwise
not shared with other threads.
The bigger point is that you should not want to use the session
with multiple concurrent threads. That would be like having everyone at a
restaurant all eat from the same plate. The session is a local “workspace”
that you use for a specific set of tasks; you don’t want to, or need to,
share that session with other threads who are doing some other task.
Making sure the Session is only used in a single concurrent thread at a time
is called a “share nothing” approach to concurrency. But actually, not
sharing the Session implies a more significant pattern; it
means not just the Session object itself, but
also all objects that are associated with that Session, must be kept within
the scope of a single concurrent thread. The set of mapped
objects associated with a Session are essentially proxies for data
within database rows accessed over a database connection, and so just like
the Session itself, the whole
set of objects is really just a large-scale proxy for a database connection
(or connections). Ultimately, it’s mostly the DBAPI connection itself that
we’re keeping away from concurrent access; but since the Session
and all the objects associated with it are all proxies for that DBAPI connection,
the entire graph is essentially not safe for concurrent access.
If there are in fact multiple threads participating
in the same task, then you may consider sharing the session and its objects between
those threads; however, in this extremely unusual scenario the application would
need to ensure that a proper locking scheme is implemented so that there isn’t
concurrent access to the Session or its state. A more common approach
to this situation is to maintain a single Session per concurrent thread,
but to instead copy objects from one Session to another, often
using the Session.merge() method to copy the state of an object into
a new object local to a different Session.